
Thread: Potential, harmless typo in ANSI-UTF.VDM

  1. #1

    Potential, harmless typo in ANSI-UTF.VDM

    I assume that I have found a totally harmless typo in ANSI-UTF.VDM. Specifically, line 210 says that

    B5 03 EE Greek small letter gamma

    Actually, in the macro the character is replaced by the Vedit OEM character with hexadecimal value EE. I believe that the word "gamma" should be replaced by the word "epsilon".

    To check that my assumption is correct, I visited the www.unicode.org/charts/PDF
    website and downloaded the file 0307.pdf. This file does indeed show the glyph for "Greek small letter epsilon", with the code 03B5 under it. In other words, it is the same code as yours with the two bytes in reverse order.

    This change would also make the macro consistent with the Vedit {Misc, ASCII table ..} menu command. For the hexadecimal value EE, this table displays the same icon as your line 210. In other words, it displays an "epsilon".

    I also checked "Code page 437" on microsoft.msdn.com, and under Unicode 03B5 it also has an "epsilon".
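
    As a quick cross-check of the values above, here is a minimal Python sketch (not part of the macro). It assumes that Python's built-in cp437 codec is a fair stand-in for the Vedit OEM character set; the helper name describe is made up for illustration.

```python
import unicodedata

EPSILON = "\u03b5"  # the code point listed as 03B5 in the Unicode chart


def describe(ch: str) -> dict:
    """Return the Unicode name and the byte values of one character."""
    return {
        "name": unicodedata.name(ch),
        # UTF-16LE stores the low byte first, hence "B5 03" for U+03B5
        "utf16le": ch.encode("utf-16-le").hex(" ").upper(),
        # Code page 437 byte, assumed here to match the Vedit OEM set
        "cp437": ch.encode("cp437").hex().upper(),
    }


info = describe(EPSILON)
print(info)
```

    If the assumption holds, the name reported for the byte pair B5 03 is epsilon, and the OEM byte is EE, which agrees with the translation on line 210 and disagrees only with its description.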

    In any case, a big thank you for ANSI-UTF.VDM. I like it very much! In fact, I do prefer your table in this macro to the one of the Vedit menu command. My reason is that in your table the value of the Font_Charset parameter is explicit, while in the Vedit menu command it is not.

    Incidentally, I would love to add the line Char Set = Vedit ANSI or Char Set = Vedit OEM,
    whichever is the case, to the table displayed by this Vedit menu command. However, I do not know how to do it. In fact, I do not know whether this is an internal Vedit command or whether Vedit executes this command via a macro.

    Thanks again, I have learned quite a bit from your macro.

    Last edited by rejto; September 3rd, 2011 at 08:30 PM.

  2. #2
    Christian and I generated this macro a few years ago, but the translation tables were produced by me.
    I have a file dated 2004 which was the source of the table. This contains the same error.

    The actual translations should be correct (they were generated programmatically) but the descriptions were added manually.
    After all this time I can't remember the details, but I suspect a simple editing error, as U+03B3 Greek small letter gamma does not exist in CP 437.
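
    The point about gamma can be checked with a short Python sketch. It relies on Python's built-in cp437 codec, which is assumed here to match the code page the table was built from; in_code_page is a hypothetical helper name.

```python
def in_code_page(ch: str, codec: str = "cp437") -> bool:
    """Return True if the character can be encoded in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


small_gamma_ok = in_code_page("\u03b3")    # Greek small letter gamma
capital_gamma_ok = in_code_page("\u0393")  # Greek capital letter gamma
small_epsilon_ok = in_code_page("\u03b5")  # Greek small letter epsilon

print(small_gamma_ok, capital_gamma_ok, small_epsilon_ok)
```

    Under that assumption only the small gamma is missing from the code page, so a line describing it could not have come from a real CP 437 entry.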

  3. #3
    I don't fully understand this, but since Ian also thinks it should read "epsilon", I have changed the macro.

  4. #4
    Are the macros ansi-utf.vdm and utf-ansi.vdm needed any more?
    Utf-conv.vdm can replace them both.

    But Utf-conv.vdm has the same error in the comment, as expected.

    By the way, the menu still says "Unicode to ASCII" and "ASCII to Unicode".
    It should be "Unicode to ANSI" and "ANSI to Unicode".
    (And if you use utf-conv.vdm to implement these menu commands, you can remove the "(UTF-16)" part from the menu text.)


  5. #5
    I only did some small tests but yes, there seems to be some redundancy in the {Edit, Translate} menu.
    The three Unicode-related items should be reducible to one (the third one).

  6. #6
    Or, in case someone prefers direct conversion, the first two items could just call the labels FROM_UTF and TO_UTF in utf-conv.vdm.
    But these items are not really needed, since the macro usually can automatically detect the direction.
    In any case, the old macros are not needed any more.

  7. #7
    Originally posted by pal:
        Or, in case someone prefers direct conversion, the first two items could just call the labels FROM_UTF and TO_UTF in utf-conv.vdm.
        But these items are not really needed, since the macro usually can automatically detect the direction.
        In any case, the old macros are not needed any more.

    I am not sure that Vedit is not already calling the entry points in utf-conv.vdm. It is not possible to determine this directly, although it wouldn't be that difficult to work out (by modifying or deleting macros).

    It is (almost) impossible to distinguish between UTF-8 and ANSI, so the macro cannot automatically detect the direction.
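
    A minimal Python sketch of why the two are hard to tell apart. It assumes "ANSI" means Windows-1252 (cp1252), which is only one of several possible ANSI code pages; plausible_encodings is a made-up helper and has nothing to do with how utf-conv.vdm works internally.

```python
from typing import List


def plausible_encodings(data: bytes) -> List[str]:
    """List which of UTF-8 and cp1252 can decode the bytes without error."""
    ok = []
    for codec in ("utf-8", "cp1252"):
        try:
            data.decode(codec)
            ok.append(codec)
        except UnicodeDecodeError:
            pass
    return ok


ascii_only = b"plain text"    # pure ASCII is identical in both encodings
two_bytes = b"caf\xc3\xa9"    # UTF-8 for "cafe" with an accented e, also valid cp1252
lone_high = b"caf\xe9"        # cp1252 accented e, not valid UTF-8

print(plausible_encodings(ascii_only))
print(plausible_encodings(two_bytes))
print(plausible_encodings(lone_high))
```

    Only the last case can be decided from the bytes alone. The first two decode cleanly either way, so any automatic choice there is a guess.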

    In actual fact, there are a lot of other options which cannot be called directly; they would need separate menu items.

    I might add, that as author of these macros, I do not actually use the published versions.
    I have my custom macros to convert UTF-8 <-> UTF-16LE and ANSI <-> UTF-16LE - these cover virtually all my needs, and do not ask silly questions every time I run them. Occasionally I need to run 2 macros in sequence e.g. UTF-8 <-> UTF-16LE <-> ANSI (or ASCII on the very rare occasions I need to handle OEM).

    My approach to macros is much the same as C programming or UNIX shell programming.
    I prefer functions which do one task, and link them together to perform more complex tasks.
    This is more flexible and easier to debug.
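
    The same idea can be sketched in Python (this is an illustration, not a translation of the actual macros). It assumes "ANSI" means cp1252 and ignores BOM handling; the function names and the chain helper are hypothetical.

```python
def utf8_to_utf16le(data: bytes) -> bytes:
    """One task only: re-encode UTF-8 bytes as UTF-16LE (no BOM handling)."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16le_to_ansi(data: bytes) -> bytes:
    """One task only: re-encode UTF-16LE bytes as cp1252 (assumed ANSI)."""
    return data.decode("utf-16-le").encode("cp1252")


def chain(*steps):
    """Link single-purpose converters into one callable, applied left to right."""
    def run(data: bytes) -> bytes:
        for step in steps:
            data = step(data)
        return data
    return run


# UTF-8 -> UTF-16LE -> ANSI, built from the two small steps above
utf8_to_ansi = chain(utf8_to_utf16le, utf16le_to_ansi)
result = utf8_to_ansi("caf\u00e9".encode("utf-8"))
print(result)
```

    Each step can be tested on its own, and a failure in the chain points at exactly one small function.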

  8. #8
    I agree that this can all be done with just the utf-conv.vdm macro. However, like Ian, I don't like the FROM_UTF and TO_UTF labels as they still display dialog boxes.
    However, I like the UTF_ANSI and ANSI_UTF16 labels which skip the dialog boxes.

    Therefore, I plan to:

    * Rename the "Unicode (UTF-16) to ASCII" function to "Unicode to ANSI" and implement it with CallF(122,"utf-conv","UTF_ANSI")

    * Rename the "ASCII to Unicode (UTF-16)" function to "ANSI to Unicode (UTF-16LE)" and implement it with CallF(122,"utf-conv","ANSI_UTF16")

    * Add another function "ANSI to Unicode (UTF-8)" and implement it with CallF(122,"utf-conv","ANSI_UTF8")

    * Leave the "Unicode (UTF-16 and UTF-8)" menu item

    From my brief tests, it appears the "UTF_ANSI" label can tell the difference between UTF-8 and UTF-16 and therefore only one function is needed in this direction.

    It appears these three functions could handle most conversions, with the dialog item handling the rest.

    If this sounds good, I will implement it ASAP.

  9. #9
    Sounds OK.

    I have some pathological test cases which can cause UTF-16 to be confused with UTF-8 or ANSI, but these are deliberately constructed and unlikely to occur in practice.
    In any event, most UTF files that users will encounter have a BOM.
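
    For files that do carry a BOM, detection is straightforward. A minimal Python sketch, assuming only UTF-8 and UTF-16 need to be recognised (UTF-32 is deliberately ignored, although its little-endian BOM begins with the same two bytes as UTF-16LE); sniff_bom is a made-up name and not a label in utf-conv.vdm.

```python
import codecs
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    if data.startswith(codecs.BOM_UTF8):       # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "utf-16-be"
    return None


print(sniff_bom(b"\xef\xbb\xbfabc"))
print(sniff_bom(b"\xff\xfea\x00"))
print(sniff_bom(b"abc"))
```

    When the result is None, the file has to be classified by its content, which is where the constructed edge cases come in.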

  10. #10

    Thank you for this interesting discussion.

    I vaguely remember that I got interested in the UTF to ASCII conversion when I was editing a copy of my Win XP x64 Registry
    in Vedit. Somehow, I used Vedit to convert the UTF characters to ASCII characters. However, my Win XP x64 system crashed and my records were lost.

    Finally, I replaced the crashed Win XP x64 system by a Win 7 x64 system. In short, I am looking for a safe way to edit my Win 7 x64 Registry.

    Any suggestion would be appreciated.

