How do we display CYRILLIC character sets in VEDIT?



TychoAussie
January 7th, 2015, 12:12 PM
I have received text files from my research partners in Russia that contain characters in the Cyrillic character set. I need to figure out how to display them in Cyrillic rather than the odd, weird characters that Microsoft substitutes in their place.

I need to display the Cyrillic characters in order to properly translate them!

Does anyone have any ideas on how to do this?

Scott Lambert
January 15th, 2015, 12:16 PM
Hi,

The best idea I can come up with is a macro that would translate the Cyrillic characters into a readable code.

Take a look at http://en.wikipedia.org/wiki/Russian_alphabet and you will see the uppercase letters of the Russian alphabet. Take the character that comes after the A: it looks to me like a combo of an F and an S, so the proposed macro would translate it to something like [FS].

The first character on the second row would translate to [E..]. The next character would translate to [-KK]. The minus sign would mean the character should be mirrored.

The key to writing such a macro is to figure out what ASCII value Vedit sees when it encounters a Cyrillic character.

Scott
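
A rough way to see which byte values an editor would encounter in such a file is sketched below in Python. The list of candidate encodings (Windows-1251, CP866, KOI8-R) is an assumption; the thread does not say which encoding the files use. The function name high_byte_report is made up for illustration.

```python
from collections import Counter

# Candidate 8-bit Cyrillic encodings. This list is an assumption; the thread
# does not state which encoding the Russian files actually use.
CANDIDATES = ("cp1251", "cp866", "koi8_r")


def high_byte_report(data: bytes) -> list:
    """Return (byte value, count, {encoding: character}) for each byte >= 0x80."""
    counts = Counter(b for b in data if b >= 0x80)
    report = []
    for value, count in sorted(counts.items()):
        # Show what this single byte would mean under each candidate encoding.
        decoded = {
            enc: bytes([value]).decode(enc, errors="replace") for enc in CANDIDATES
        }
        report.append((value, count, decoded))
    return report


if __name__ == "__main__":
    # Sample bytes standing in for the file contents: a two-letter Russian
    # word encoded as Windows-1251.
    sample = "Да".encode("cp1251")
    for value, count, decoded in high_byte_report(sample):
        print(f"0x{value:02X} ({value}) x{count}: {decoded}")
```

For a real file, the bytes would come from something like open(path, "rb").read(). The encoding whose column spells readable Russian is the likely one.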

pal
January 16th, 2015, 10:16 AM
Are those characters 8-bit characters from a Cyrillic code page, or are they Unicode (UTF-8 or UTF-16) characters?

You need a font that contains Cyrillic characters, and you need to have the Cyrillic code page selected in Windows.
Then, any 8-bit Cyrillic characters would be displayed directly in Vedit without you having to do anything.

However, if the characters are Unicode, you would need to convert them to 8-bit.
Unfortunately, the Unicode conversion of Vedit only supports the ANSI (ISO Latin-1) and OEM character sets.
But Notepad should be able to show the Unicode characters directly, as long as you have a suitable font.
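
The 8-bit versus Unicode question above can be checked roughly with a few lines of Python. This is only a heuristic sketch: it looks for a byte order mark, then tries a strict UTF-8 decode. It will not recognise UTF-16 without a byte order mark, and the function name classify_text_bytes is made up for illustration.

```python
import codecs


def classify_text_bytes(data: bytes) -> str:
    """Guess whether data is UTF-16, UTF-8, plain ASCII, or an 8-bit code page."""
    # A byte order mark at the start is the clearest sign of Unicode text.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    try:
        # Text in an 8-bit Cyrillic code page almost never forms valid UTF-8.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "8-bit code page"
    return "ascii" if text.isascii() else "utf-8"


if __name__ == "__main__":
    word = "Привет"
    for label, raw in [
        ("Windows-1251", word.encode("cp1251")),
        ("UTF-8", word.encode("utf-8")),
        ("UTF-16 with BOM", word.encode("utf-16")),
    ]:
        print(label, "->", classify_text_bytes(raw))
```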

pal
January 16th, 2015, 10:32 AM
Hi,
Best idea, I can come up with is a macro that would translate the Cyrillic characters into a readable code.
...
Scott

That could be one possible way to do it.
But in that case, I think it would be better to transliterate each character into the Latin alphabet using one of the transliteration standards.
See http://en.wikipedia.org/wiki/ISO_9

So, for example, Славься, Отечество наше свободное would transliterate into Slavʹsâ, Otečestvo naše svobodnoe.
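
As an illustration of the transliteration idea, here is a minimal Python sketch of the ISO 9:1995 mapping for the 33 letters of the modern Russian alphabet. It is not a Vedit macro, and it leaves out the other Cyrillic letters that ISO 9 also covers. The names _LOWER, _TABLE and transliterate are made up for illustration.

```python
# ISO 9:1995 mapping for the lowercase letters of the modern Russian alphabet.
# Other Cyrillic letters covered by the standard are omitted from this sketch.
_LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "ë",
    "ж": "ž", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "ш": "š", "щ": "ŝ",
    "ъ": "\u02ba",  # hard sign -> modifier letter double prime
    "ы": "y",
    "ь": "\u02b9",  # soft sign -> modifier letter prime
    "э": "è", "ю": "û", "я": "â",
}

# Derive the uppercase entries from the lowercase ones; the prime characters
# have no uppercase form, so upper() leaves them unchanged.
_TABLE = str.maketrans(
    {**_LOWER, **{k.upper(): v.upper() for k, v in _LOWER.items()}}
)


def transliterate(text: str) -> str:
    """Replace Russian letters with ISO 9 Latin letters; leave the rest as-is."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(transliterate("Славься, Отечество наше свободное"))
```

Because ISO 9 maps each Cyrillic letter to exactly one Latin character, a plain character-for-character table is enough and the result can be converted back without ambiguity.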

ian binnie
January 16th, 2015, 06:50 PM
I have received text files from my research partners in Russia that contain characters in the Cyrillic character set. I need to figure out how to display them in Cyrillic rather than the odd, weird characters that Microsoft substitutes in their place.

I need to display the Cyrillic characters in order to properly translate them!

Does anyone have any ideas on how to do this?

I wrote the Vedit Unicode translation, and did some work on fonts. There are no fonts that Vedit can use to display Cyrillic.

Your best bet is to use a program with native Unicode support. I have done this, and used Google Translate, to work on some Cyrillic in the past.

TonyGDI
January 16th, 2015, 09:49 PM
Hi Guys, I spoke with TychoAussie regarding this requirement. The best low-tech recommendation I could make to him was to open his data file in Firefox (file:///c:\my_download_folder\INFO.txt) and switch the charset to "Cyrillic (Windows)." This enabled him to copy/paste the text into Google Translate.

This option worked for him and required minimal effort.

There are other alternatives, given the vast array of Cyrillic tools on the internet.
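
One such alternative, sketched here in Python, is to re-encode the file as UTF-8 so that any Unicode-aware program can open it. The sketch assumes the file is Windows-1251, which is the encoding Firefox's "Cyrillic (Windows)" setting selects. The function name convert_to_utf8 and the output file name in the usage note are made up for illustration.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1251") -> str:
    """Decode src from an 8-bit code page and write it to dst as UTF-8."""
    # Read raw bytes and decode explicitly so line endings pass through as-is.
    text = src.read_bytes().decode(source_encoding)
    # "utf-8-sig" prepends a byte order mark, which helps Notepad recognise
    # the file as UTF-8.
    dst.write_bytes(text.encode("utf-8-sig"))
    return text
```

A call might look like convert_to_utf8(Path("INFO.txt"), Path("INFO_utf8.txt")). If the result is still unreadable, the source is probably in a different encoding, and "cp866" or "koi8_r" are the usual other candidates for Russian text.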

rejto12
February 26th, 2017, 03:57 PM
Are those characters 8-bit characters from a Cyrillic code page, or are they Unicode (UTF-8 or UTF-16) characters?
You need a font that contains Cyrillic characters, and you need to have the Cyrillic code page selected in Windows.
Then, any 8-bit Cyrillic characters would be displayed directly in Vedit without you having to do anything.

However, if the characters are Unicode, you would need to convert them to 8-bit.
Unfortunately, the Unicode conversion of Vedit only supports the ANSI (ISO Latin-1) and OEM character sets.
But Notepad should be able to show the Unicode characters directly, as long as you have a suitable font.


Pauli,

I am re-reading your message and find it amazing. Specifically, I find the part,

"You need a font that contains Cyrillic characters, and you need to have the Cyrillic code page selected in Windows.
Then, any 8-bit Cyrillic characters would be displayed directly in Vedit without you having to do anything."

absolutely amazing. I have tried to display the Cyrillic code page in Vedit, but I could not do it. Do I understand your message correctly? It implies that I do not need to try to display the Cyrillic code page in Vedit; I should do it in Windows.

So, the next question is how to do it in Windows. Can you help me?

Thanks,

-peter

rejto12
February 27th, 2017, 01:57 PM
Pauli,

I did some more thinking and I would like to give you an informal report. I googled "code page" and ended up at ss64.com.

One of the things that I learned was that CHCP.com is a Windows command. Since I like to call Windows commands from Vedit, I added the

23
Display Active code page
System("chcp", DOS)

command to my diag.mnu file. My Win 7 beautifully replied with an

Active code page is 437

message. Then I experimented with commands like,

System("chcp 866",DOS)

Again, I got a message: Active code page 866.

I also had my Hungarian dictionary open in Vedit and I was watching an accented word. No change at all. This was suspicious, so I issued the Display Active code page command again.

Now came the confirmation of my suspicion. My code page is still the default 437. In short, I did not succeed in setting a new code page.

I also went back to Tony's message in this thread. I have a hunch that he recommended doing the code page setting in Firefox because it does a better job than Windows.

At this point I am taking a break.


Thanks for all your help.

-peter
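
One possible reading of the experiment above (an assumption, not something confirmed in this thread) is that chcp changes the code page only for the command session it runs in. If each System() call starts a separate session, the change would be gone by the next call. Separately, the short Python sketch below shows why the active code page matters at all: the same bytes read as different characters under different code pages. It assumes the Hungarian text is stored as Windows-1250, and the function name render_under_code_pages is made up for illustration.

```python
# Code pages to compare: the two mentioned in the experiment (437 and 866)
# plus the Windows Central European and Cyrillic pages.
CODE_PAGES = ("cp437", "cp866", "cp1250", "cp1251")


def render_under_code_pages(raw: bytes) -> dict:
    """Decode the same bytes under each code page in CODE_PAGES."""
    return {cp: raw.decode(cp, errors="replace") for cp in CODE_PAGES}


if __name__ == "__main__":
    # "szótár" is Hungarian for "dictionary"; storing it as Windows-1250 is an
    # assumption about how such a word list might be encoded.
    raw = "szótár".encode("cp1250")
    for code_page, shown in render_under_code_pages(raw).items():
        print(f"{code_page}: {shown}")
```

Only the code page that matches the encoding the file was saved in reproduces the accented letters; the others show different symbols for the same two bytes.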

rejto12
February 28th, 2017, 07:17 PM
Pauli,

1.:
I forgot to ask you what your code page is. Furthermore, can you change it in Windows to 437?

2.:
I take it your email correspondence is bilingual. In other words, if you get a Swedish email you answer it in Swedish, and if you get an English email you answer it in English. I have a hunch that this is a code page problem, and I wonder who handles it for you?


3.:
Do you use an English spell checker, and do you use a Swedish spell checker/dictionary?

I remember that many, many years ago I asked our German linguist and fellow Vedit user, Heberlein, the corresponding question. His answer was: No, I do not use a German spell checker/dictionary.
Furthermore, "my reason is that the German language uses a lot of declensions, and for such a language these aids are not useful." Nevertheless, he generously helped me with my English dictionary problem.

I know that you are busy, very busy. At the same time, I would appreciate your looking at the Swedish-English dictionary at WinEdt.org and let me know your reactions. I am an enthusiastic user of their Hungarian-English dictionary.


Thanks for all your help.


-peter