UPDATE for U+30FB (KATAKANA MIDDLE DOT):
As @sergey-tachenov pointed out, the issue is related to U+30FB
(KATAKANA MIDDLE DOT), so that is what needs to be solved. Here are some suggestions based on previous project experience.
Suggestion-1:
In one of my projects we were producing a manual in various languages and ran into the same kind of issue. We used Lotus Notes. In that case we wrote filters that replaced those characters or glyphs with a dot. Lotus Notes then created the folder and file names from the filtered text, and those names were later used as paths. That solved the problem. If you have that kind of option, it is an easy fix.
Suggestion-2:
Many people have run into the same issue and have tried different fixes.
Some report:
- replacing it with a dot (.) solved the issue.
KATAKANA MIDDLE DOT (・) is a double-width character. If you want to
use the Katakana (Japanese) mid dot, consider using the HALFWIDTH
KATAKANA MIDDLE DOT instead.
- switched to the regular bullet and it works fine.
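The replacements reported above can be sketched in a few lines of Java. This is only a minimal illustration, not the filter we used in Lotus Notes; the class and method names (MiddleDotFilter, replaceMiddleDot) are made up for this example.

```java
/** Sketch: swap U+30FB for a substitute before the name is used in a path. */
final class MiddleDotFilter {
    static final char KATAKANA_MIDDLE_DOT = '\u30FB';           // fullwidth middle dot
    static final char HALFWIDTH_KATAKANA_MIDDLE_DOT = '\uFF65'; // halfwidth variant
    static final char MIDDLE_DOT = '\u00B7';                    // Latin-1 middle dot

    /** Returns the name with every U+30FB replaced by the given substitute. */
    static String replaceMiddleDot(String name, char substitute) {
        return name.replace(KATAKANA_MIDDLE_DOT, substitute);
    }

    public static void main(String[] args) {
        // Hypothetical directory name containing U+30FB between two parts
        String dir = "cost\u30FBcontrol";
        System.out.println(replaceMiddleDot(dir, '.'));
        System.out.println(replaceMiddleDot(dir, HALFWIDTH_KATAKANA_MIDDLE_DOT));
        System.out.println(replaceMiddleDot(dir, MIDDLE_DOT));
    }
}
```

Which substitute is acceptable depends on how the names are displayed and whether the renamed folders must still match existing ones on the server.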
twitter-text also has a fix for KATAKANA MIDDLE DOT (・); see its GitHub repo.
Resource Link: Katakana Middle Dot issue solved in Twitter-Text
However, in an Atom issue, chrissimpkins stated the following about the Hack font:
I can confirm that we do not have a Katakana middle dot glyph (U+30FB)
in the regular Hack font. There is a middle dot (U+00B7) that will
have the appearance that you are after here. I can confirm that the
U+00B7 glyph has the same fixed width spacing as the rest of the
regular set (and all other variant sets).
Resource Link: https://github.com/atom/atom/issues/9115
First, note that the dot or period (.) is an ASCII character, so the dot itself is not the issue. Character encoding and server settings may be the issue.
URLs can only be sent over the Internet using the ASCII character-set. If a URL contains characters outside the ASCII set, the URL has to be converted.
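As a rough illustration of that conversion, java.net.URI can produce the ASCII-only form of a URL by percent-encoding non-ASCII characters as UTF-8 bytes. The host and path below are placeholders, and this sketch only covers generic URLs; it makes no claim about whether jCIFS expects escaped paths.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class AsciiUrlSketch {
    /** Builds a URI from raw parts and returns its ASCII-only form. */
    static String toAsciiUrl(String scheme, String host, String path) {
        try {
            // The multi-argument constructor quotes illegal characters such as spaces;
            // toASCIIString() then percent-encodes non-ASCII characters as UTF-8 bytes.
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build URI", e);
        }
    }

    public static void main(String[] args) {
        // "\u30FB" is U+30FB KATAKANA MIDDLE DOT; host and path are placeholders
        System.out.println(toAsciiUrl("http", "example.com", "/docs/a\u30FBb/some file.txt"));
    }
}
```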
An SMB URL has the following form:
smb://[[[domain;]username[:password]@]server[:port]/[[share/[dir/]file]]][?param=value[param2=value2[...]]]
jCIFS can also address servers and workgroups.
Important: all SMB URLs that represent workgroups, servers, shares, or
directories require a trailing slash '/'.
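A small helper can enforce the trailing-slash rule before the URL is handed to jCIFS. This is a sketch with hypothetical names (SmbUrlSketch, asDirectoryUrl), and it assumes the URL carries no ?param=value part.

```java
final class SmbUrlSketch {
    /** Appends '/' when the URL of a workgroup, server, share or directory lacks one. */
    static String asDirectoryUrl(String smbUrl) {
        return smbUrl.endsWith("/") ? smbUrl : smbUrl + "/";
    }

    public static void main(String[] args) {
        // Placeholder host and share names
        System.out.println(asDirectoryUrl("smb://host/share/dir"));
        System.out.println(asDirectoryUrl("smb://host/share/dir/"));
    }
}
```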
When using the java.net.URL class with 'smb://' URLs it is necessary
to first call the static jcifs.Config.registerSmbURLHandler() method.
This is required to register the SMB protocol handler.
The userinfo component of the SMB URL (domain;user:pass) must be URL
encoded if it contains reserved characters. According to RFC 2396
these characters are non US-ASCII characters and most meta characters
however jCIFS will work correctly with anything but '@' which is used
to delimit the userinfo component from the server and '%' which is the
URL escape character itself.
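The two characters singled out above can be escaped by hand before the URL is assembled. The sketch below handles only '%' and '@'; it does not attempt full RFC 2396 escaping, and the names UserInfoEscaper, escape and buildUrl are invented for this example.

```java
final class UserInfoEscaper {
    /** Escapes '%' first so that the "%40" produced for '@' is not escaped again. */
    static String escape(String part) {
        return part.replace("%", "%25").replace("@", "%40");
    }

    /** Assembles smb://domain;user:password@hostAndPath with escaped userinfo. */
    static String buildUrl(String domain, String user, String password, String hostAndPath) {
        return "smb://" + escape(domain) + ";" + escape(user) + ":" + escape(password)
                + "@" + hostAndPath;
    }

    public static void main(String[] args) {
        // Placeholder credentials; the password contains both special characters
        System.out.println(buildUrl("DOM", "user", "p@ss%1", "host/share/"));
    }
}
```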
Character Checking and Setting
Next, find out which charset you are using. The following code prints the JVM default:
System.out.println(Charset.defaultCharset());
On the Samba side, the command
$ testparm -v | grep dos
shows Samba's default OEM encoding (the dos charset parameter).
CIFS uses either UTF-16LE or a default codepage. The default
codepage used by JCIFS is Cp850 or US_ASCII.
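To see whether a given codepage can represent U+30FB at all, you can ask the JVM directly. This is a sketch using only java.nio.charset; CharsetProbe is a made-up name, and Cp850 and Shift_JIS are only reported as usable if your JRE ships them.

```java
import java.nio.charset.Charset;

final class CharsetProbe {
    /** True when the named charset is installed and can encode the character. */
    static boolean canEncode(String charsetName, char c) {
        if (!Charset.isSupported(charsetName)) {
            return false;
        }
        Charset cs = Charset.forName(charsetName);
        // Some charsets are decode-only, so check canEncode() before asking for an encoder
        return cs.canEncode() && cs.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        System.out.println("JVM default: " + Charset.defaultCharset());
        for (String name : new String[] {"US-ASCII", "Cp850", "Shift_JIS", "UTF-8"}) {
            System.out.println(name + " can encode U+30FB: " + canEncode(name, '\u30FB'));
        }
    }
}
```

If the OEM codepage in use cannot encode the character, any name containing it is likely to be mangled whenever the non-Unicode path is taken.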
In jCIFS you can set it to UTF-8 and check:
System.setProperty("jcifs.encoding", "UTF8");
For a Japanese locale, you can try:
System.setProperty("jcifs.encoding", "Shift_JIS");
If the encoding is not set correctly, share names, passwords,
and in some cases file and directory names that contain non-ASCII
characters may not be handled properly.
By default this property is Cp850 which is MS-DOS Latin1.
Note: The Cp850 charset converter is located in jre/lib/charsets.jar
which AFAIK is only supported by the internationalized version of
Sun's JRE. If Cp850 is not available an exception will occur. To avoid
this exception you can set jcifs.encoding to ASCII but share names and
passwords with non-ASCII characters will not be processed correctly.
To determine if jCIFS is properly processing these characters create a
share that contains non-ASCII characters (e.g. Grüße) and then try to
list that share with the ListFiles.java example program.
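To avoid the exception mentioned in the note, you could check that the charset exists before setting the property. The sketch below is stdlib-only; JcifsEncodingSetup and configure are hypothetical names. It assumes that jCIFS reads jcifs.encoding when its classes are first loaded, so the call would have to run before any jCIFS class is touched.

```java
import java.nio.charset.Charset;

final class JcifsEncodingSetup {
    /** Sets jcifs.encoding to the first candidate this JVM supports and returns it. */
    static String configure(String... candidates) {
        for (String name : candidates) {
            if (Charset.isSupported(name)) {
                System.setProperty("jcifs.encoding", name);
                return name;
            }
        }
        throw new IllegalArgumentException("none of the candidate charsets is supported");
    }

    public static void main(String[] args) {
        // Prefer a Japanese codepage, fall back to plain ASCII as the note suggests
        System.out.println("jcifs.encoding = " + configure("Shift_JIS", "US-ASCII"));
    }
}
```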
Setting Client Properties with Japanese
For Japanese, you could try setting jcifs.encoding = Shift_JIS.
The following table shows the Japanese encoding sets supported by J2SE 5.0. The canonical names used by the new java.nio APIs are in many cases not the same as those used in the java.io and java.lang APIs.
----------------------------------------------------------------------------------------------
|Canonical Name for | Canonical Name for java.io | Description |
| java.nio API | and java.lang API | |
----------------------------------------------------------------------------------------------
| EUC-JP | EUC_JP | JISX 0201, 0208 and 0212, EUC encoding |
| | | Japanese |
----------------------------------------------------------------------------------------------
| ISO-2022-JP | ISO2022JP | JIS X 0201, 0208, in ISO 2022 form, |
| | | Japanese |
----------------------------------------------------------------------------------------------
| Shift_JIS | SJIS | Shift-JIS, Japanese |
----------------------------------------------------------------------------------------------
| windows-31j | MS932 | Windows Japanese |
----------------------------------------------------------------------------------------------
| x-euc-jp-linux | EUC_JP_LINUX | JISX 0201, 0208, EUC encoding Japanese |
----------------------------------------------------------------------------------------------
| x-eucJP-Open | EUC_JP_Solaris | JISX 0201, 0208, 0212, EUC encoding |
| | | Japanese |
----------------------------------------------------------------------------------------------
| x-IBM33722 | Cp33722 | IBM-eucJP - Japanese (superset of 5050) |
----------------------------------------------------------------------------------------------
| x-IBM930 | Cp930 | Japanese Katakana-Kanji mixed with 4370 |
| | | UDC, superset of 5026 |
----------------------------------------------------------------------------------------------
| x-IBM939 | Cp939 | Japanese Latin Kanji mixed with 4370 |
| | | UDC, superset of 5035 |
----------------------------------------------------------------------------------------------
| x-IBM942 | Cp942 | IBM OS/2 Japanese, superset of Cp932 |
----------------------------------------------------------------------------------------------
| x-IBM943 | Cp943 | IBM OS/2 Japanese, superset of Cp932 |
| | | and Shift-JIS |
----------------------------------------------------------------------------------------------
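If you are unsure which of the two naming schemes a string belongs to, the JVM can resolve it for you. The following sketch (CharsetNames and canonicalName are made-up names) prints the java.nio canonical name for a few of the java.io names from the table; charsets that are not installed in your JRE are reported as not supported rather than guessed.

```java
import java.nio.charset.Charset;

final class CharsetNames {
    /** Returns the java.nio canonical name for a supported name or alias, else null. */
    static String canonicalName(String nameOrAlias) {
        return Charset.isSupported(nameOrAlias) ? Charset.forName(nameOrAlias).name() : null;
    }

    public static void main(String[] args) {
        for (String alias : new String[] {"SJIS", "MS932", "EUC_JP", "ISO2022JP"}) {
            String canonical = canonicalName(alias);
            System.out.println(alias + " -> " + (canonical != null ? canonical : "not supported"));
        }
    }
}
```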
Here are some full code examples for jCIFS that you could try:
- Copying files over network shared folder using Java
- Copying the resources to and from windows network using Java
- Java Tutorial – Using JCIFS to copy files to shared network drive using username and password
Here's an example to retrieve a file:
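import jcifs.smb.*;
public class ReadSmbFile {
    public static void main(String[] args) throws Exception {
        // WINS server used for NetBIOS name resolution
        jcifs.Config.setProperty("jcifs.netbios.wins", "192.168.1.220");
        NtlmPasswordAuthentication auth = new NtlmPasswordAuthentication("domain", "username", "password");
        // Credentials are attached to the SmbFile, which is then opened as a stream
        SmbFile file = new SmbFile("smb://host/c/My Documents/人事部/要員・コスト管理課/somefile.txt", auth);
        try (SmbFileInputStream in = new SmbFileInputStream(file)) {
            byte[] b = new byte[8192];
            int n;
            while ((n = in.read(b)) > 0) {
                System.out.write(b, 0, n);
            }
            System.out.flush();
        }
    }
}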
You can also read/write, delete, make directories, rename, list contents of a directory, list the workgroups/ntdomains and servers on the network, list the shares of a server, open named pipes, authenticate web clients ...etc.
The SmbFile, SmbFileInputStream, and SmbFileOutputStream classes are
analogous to the File, FileInputStream, and FileOutputStream classes.
Using SmbFile together with FileOutputStream, the code to copy files from a share to a local directory looks like this:
SmbFile[] files = getSMBListOfFiles(sb, logger, domain, userName, password, sourcePath, sourcePattern);
if (files == null)
    return false;
output(sb, logger, " Source file count: " + files.length);
// One copy buffer, reused for every file
byte[] buf = new byte[16 * 1024 * 1024];
int len;
for (SmbFile smbFile : files) {
    String destFilename = destinationPath + smbFile.getName();
    output(sb, logger, " copying " + smbFile.getName());
    // try-with-resources closes both streams even when the copy fails
    try (InputStream fileInputStream = smbFile.getInputStream();
         FileOutputStream fileOutputStream = new FileOutputStream(destFilename)) {
        while ((len = fileInputStream.read(buf)) > 0) {
            fileOutputStream.write(buf, 0, len);
        }
    } catch (SmbException e) {
        OutputHandler.output(sb, logger, "Exception during copyNetworkFilesToLocal stream to output, SMB issue: " + e.getMessage(), e);
        e.printStackTrace();
        return false;
    } catch (FileNotFoundException e) {
        OutputHandler.output(sb, logger, "Exception during copyNetworkFilesToLocal stream to output, file not found: " + e.getMessage(), e);
        e.printStackTrace();
        return false;
    } catch (IOException e) {
        OutputHandler.output(sb, logger, "Exception during copyNetworkFilesToLocal stream to output, IO problem: " + e.getMessage(), e);
        e.printStackTrace();
        return false;
    }
}
Credit goes to @man called haney
Resource Link: How to copy file from smb share to local drive using jcifs in Java?
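If you want to try the copy loop without a share at hand, the same read/write pattern can be exercised with in-memory streams. This is only a stdlib sketch; StreamCopy is a made-up helper, and with jCIFS the input stream would come from an SmbFile instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class StreamCopy {
    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int len;
        // read() returns -1 at end of stream
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "sample content".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            System.out.println("copied " + copy(in, out) + " bytes");
        }
    }
}
```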
Precaution-1:
A further caution for HTTPS and Tomcat users:
In most cases URLs running over HTTP work fine, but not when using HTTPS (i.e. over SSL). This usually results in Unicode (non-ASCII) characters in an HTTPS URL appearing incorrectly in the URL, and the served page contains numerous errors.
This occurs when the useBodyEncodingForURI="true" flag is not defined in the HTTPS connector definition in conf/server.xml of the Apache Tomcat application server running JIRA. This flag is set as such by default in 'recommended' distribution installations of JIRA.
However, in JIRA WAR setups, this might not be the case. Hence, ensure that the useBodyEncodingForURI="true" flag is included in the following element of the conf/server.xml file of your Apache Tomcat installation running JIRA:
<Connector port="8443" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" disableUploadTimeout="true"
acceptCount="100" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS" useBodyEncodingForURI="true" />
Specify useBodyEncodingForURI="true" in all connector definitions (i.e. both the HTTP and the HTTPS connectors), as described in the 'Modifying Tomcat server.xml' section of the Installing JIRA on Tomcat 6.0 or 7.0 documentation.
Resource Link: How to Get Unicode 'non-ASCII' Characters in HTTPS URL to Appear Correctly
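The kind of corruption described there can be reproduced in plain Java by decoding the same escaped bytes with two different charsets. This only illustrates the symptom; it is not how Tomcat decodes requests internally, and UriDecodingDemo is a made-up name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UriDecodingDemo {
    /** Decodes percent-escapes using the named charset. */
    static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String escaped = "%E3%83%BB"; // U+30FB percent-encoded as UTF-8
        // Decoded with the charset it was encoded in: one character
        System.out.println(decode(escaped, "UTF-8").length());
        // Decoded with a single-byte charset: one character per byte
        System.out.println(decode(escaped, "ISO-8859-1").length());
    }
}
```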
For more on non-ASCII characters in URLs, see:
- (Please) Stop Using Unsafe Characters in URLs
- Can I use non-ASCII characters in URLs?
- Is it advisable to have non-ascii characters in the URL?