Match command and the |W pattern code



Scott Lambert
April 9th, 2013, 12:12 PM
Hi,

Okay, in an empty buffer start on line 1, you type:

dog
cat
boy
girl
book

each word starting on column 1.

Then you write a simple macro & run it in the buffer with the words you just typed.

#10=0
begin_of_file
repeat(all) {
match("|W")
if(return_value==0) {break}
#10=#10+1
goto_line(cur_line+1)
}

Now, do you think the loop will break when it encounters a whitespace in column 1?

The answer is no. You have an endless loop, even though return_value of 0 should mean a successful match.

For the loop to break, you need:

#10=0
begin_of_file
repeat(all) {
match("|W")
if(return_value==2) {break}
#10=#10+1
goto_line(cur_line+1)
}

The above works, although you would expect to test for 0. It is either a whitespace character
or not.

Now in another empty buffer, press enter 5 times and on line 6 starting in column 1, put the word: cat

Now modify our macro like so:

#10=0
begin_of_file
repeat(all) {
match("|!|W")
if(return_value==0) {break}
#10=#10+1
goto_line(cur_line+1)
}

So the macro should break out of the loop when it encounters a line that does not begin with whitespace.

It does not do so. It does not appear to do anything. #10 always remains 0.

However if you have the following:

#10=0
begin_of_file
repeat(all) {
match("|!|W")
if(return_value==2) {break}
#10=#10+1
goto_line(cur_line+1)
}

The macro works, as you would expect.

So the question is: why, when using match with |W, does a successful match not generate a
return_value of 0?

(A related question is what would cause a return_value of 3? See help file.)

pal
April 10th, 2013, 06:56 AM
Scott,

A successful match does return zero.

Your first macro works correctly. It does not break out because there is no whitespace at the beginning of any line.
Try adding a whitespace at the beginning of one line in the text.
The second example text does not contain whitespace either. An empty line (when you just press Enter) does not contain whitespace. Whitespace is one or more Space or Tab characters.

BTW, it is not necessary to check the variable Return_Value, you could check the return value of Match() directly.
To avoid an endless loop, replace the goto_line(...) with Line(1, ERRBREAK).
So the first macro would become:


BOF
repeat(ALL) {
if (Match("|W")==0) {break}
Line(1, ERRBREAK)
}

Of course, this particular task could be done more simply with just one command:

Search("|<|W", BEGIN)

According to the documentation, Match() returns 3 if the match fails and pattern codes were used. In this case, it is not possible to know if the text was lexically smaller or larger than the parameter.

Scott Lambert
April 10th, 2013, 09:30 AM
Thanks Pauli!

My error was thinking that whitespace meant all characters that show nothing on the screen in the usual viewing mode. This included the end of line characters, tabs, space, null character, etc. Now I know it is just space and tab.

Scott

pal
April 15th, 2013, 05:06 AM
You can match extended whitespace with pattern |X. It matches one or more spaces, tabs, CR or LF characters.
To match other control characters, use |C or |K.
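
As a rough, untested sketch of that suggestion, the first macro could use |X in place of |W so that the loop also stops on an empty line, where the cursor sits on the line's CR/LF. It reuses only commands already shown in this thread; how it behaves on the very last line of a buffer is an assumption.

```
#10=0                          // count of lines skipped
BOF
Repeat(ALL) {
  // |X matches spaces, tabs, CR or LF, so an empty line also stops the loop
  if (Match("|X")==0) { Break }
  #10=#10+1
  Line(1, ERRBREAK)            // leave the loop when there is no next line
}
```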

Scott Lambert
April 15th, 2013, 10:38 AM
Thanks Pauli!

The reason I started this thread is that, in writing Keytran.vdm, I had problems getting the loop in the macro to break when reaching a blank character in column 1, due to my misunderstanding of |W.

More research on my part indicates that an unused line begins with ASCII 26. If you open a new buffer, drop to command mode and do cur_char, you get ASCII 26.

So it seems a fairly bulletproof way to test for a blank character in column 1 is: |{|026,|L,|W}

Scott

Howard Goldstein
April 15th, 2013, 01:54 PM
ASCII 26 is the end-of-buffer mark. You can use if(at_eob) to determine whether you're at the end of the buffer. Also, at_eol will return true if you're on a completely empty line.
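
A sketch of the same loop using those two tests instead of the |026 pattern. It is untested and assumes At_EOB and At_EOL behave as described above; the other commands are the ones already used earlier in this thread.

```
#10=0                               // count of lines skipped
BOF
Repeat(ALL) {
  if (At_EOB) { Break }             // end-of-buffer mark (where Cur_Char reports 26)
  if (At_EOL) { Break }             // completely empty line
  if (Match("|W")==0) { Break }     // line starts with a space or tab
  #10=#10+1
  Line(1, ERRBREAK)                 // leave the loop when there is no next line
}
```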

mrvedit
April 15th, 2013, 09:22 PM
Right. Cur_Char() needs to return something at EOF and the Ctrl-Z (026) is an ancient EOF value from the 70's and 80's.
If it is a problem, I could change it to e.g. -1, but it has worked this way for a long time.
Ted.

Howard Goldstein
April 16th, 2013, 06:43 PM
Not a problem for me. In fact, I rather like the use of ASCII 26 as it brings back fond memories of CP/M and Z-System.