Wild about text searches

Posted: Mon Oct 30, 2023 8:05 pm
by richmond62
So . . . someone posted over on the Use-List this:
I am doing the above and struggling with an oddity that I can't find guidance on, either for LiveCode or for wildcards more widely.

A simple example is I am searching text messages for 'with you' or 'with u'

so I use the wildcard form

*with [you,u]*

That finds all examples of both just fine. However, it also finds 'with unlimited cheese', 'with us', 'with yours' etc., so I want a space after both 'you' and 'u'.

When I put two spaces inside the square brackets, one after each string, the search still works but the spaces seem to be ignored (so it still finds the above examples I don't want).

If I put a single space after the brackets, the first bracketed string is ignored and the filter only finds 'with u '
Now I have NEVER played around with WILDCARD expressions, not even with HyperCard before the dinosaurs were around.

BUT, this piqued my interest, and here is my way of doing this:
[Attachment: Screenshot 2023-10-30 at 21.03.31.png]
The script in the button:

Code:

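on mouseUp
   -- clear the results field before each search
   put empty into fld "fPERKED"
   put 1 into LYNE
   -- walk fld "lf" line by line; the loop stops at the first empty line
   repeat until line LYNE of fld "lf" is empty
      put line LYNE of fld "lf" into PROCK
      -- copy the line across if it ends in "with u"
      if matchText(PROCK,"with u$") then
         put (PROCK & cr) after fld "fPERKED"
      end if
      -- copy the line across if it ends in "with you"
      if matchText(PROCK,"with you$") then
         put (PROCK & cr) after fld "fPERKED"
      end if
      add 1 to LYNE
   end repeat
end mouseUp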

Re: Wild about text searches

Posted: Mon Oct 30, 2023 11:41 pm
by richmond62
The $ [dollar sign] tells the IDE that a letter or word is exactly that, and not a prefix attached to a longer word.

The dictionary's explanations of a lot of this stuff are best described as 'gnomic'.

Re: Wild about text searches

Posted: Wed Nov 01, 2023 12:42 am
by stam
Sadly can’t check right now, but I was under the impression matchText uses regex. In regex, $ signifies end of string.

Using $ happens to work for your example, because every occurrence of "with you" or "with u" is at the end of a string, but it shouldn't work if the phrase is in the middle of a string…
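A quick way to see the difference is to test the same pattern against one string where the phrase is at the end and one where it is not. This is only a minimal sketch; the sample strings and the variable names tAtEnd and tMidLine are made up for illustration.

```livecode
on mouseUp
   local tAtEnd, tMidLine
   -- "$" anchors the match to the end of the string being tested
   put matchText("coming with you", "with you$") into tAtEnd
   put matchText("with you tomorrow", "with you$") into tMidLine
   -- tAtEnd is true, tMidLine is false
   answer "At end:" && tAtEnd & cr & "Mid-line:" && tMidLine
end mouseUp
```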

Off the top of my head the correct regex is more condensed, with no multiple matchText statements needed, as these are a part of the regex:

Code:

if matchText(PROCK,"(with\s(?:you|u))(?:\s|$)") then…

\s is any whitespace character
(a|b) means either a or b
( ) is a capturing group that is the text you're searching for
(?: ) is a non-capturing group, so the parentheses won't be counted as a find, as it's a subsection of the string you're actually searching for

So the find will be within the outer parentheses ( ): (with\s(?:you|u))(?:\s|$)
What it will search for is any occurrence of the word "with" followed by a whitespace character, followed by either "you" or "u", and this match must be followed by either a whitespace character (i.e. space, tab, CR, etc.) or by end of string.

So the above should find either “with you” or “with u”, in any part of the text… multiple matchText statements aren’t needed.

Now I say all of this off the top of my head as I'm at work and have no way to check… but I think this should work…
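For anyone who wants to try it, here is a rough sketch of the button script rewritten around that single pattern. It reuses the field names "lf" and "fPERKED" from the first post; the variable names tLine, tFound and tHits are just placeholders, and showing the captured text in brackets is only there to illustrate what the capturing group returns.

```livecode
on mouseUp
   local tLine, tFound, tHits
   repeat for each line tLine in field "lf"
      -- tFound receives the text matched by the first capturing group
      if matchText(tLine, "(with\s(?:you|u))(?:\s|$)", tFound) then
         put tLine & tab & "[" & tFound & "]" & cr after tHits
      end if
   end repeat
   delete the last char of tHits -- drop the trailing return
   put tHits into field "fPERKED"
end mouseUp
```

One thing to keep in mind: because the pattern insists on whitespace or end of string after the phrase, a line such as "are you with you?" would not be found, since the question mark is neither.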

Re: Wild about text searches

Posted: Wed Nov 01, 2023 4:36 pm
by richmond62
Indeed you are right,

and this leaves me wondering what the utility of matchText exactly is.

Re: Wild about text searches

Posted: Wed Nov 01, 2023 6:34 pm
by stam
richmond62 wrote:
Wed Nov 01, 2023 4:36 pm
Indeed you are right,

and this leaves me wondering what the utility of matchText exactly is.
It's obviously a mechanism for searching with regex, which is invaluable...

And while regex is hard to learn, it begins to make sense when you've used it a bit.
To experiment with (and learn) regex I strongly recommend using http://regex101.com

This site lets you test expressions, shows a character-level explanation of what each symbol means, and has a handy searchable reference where you can look up expressions or see what they mean. It's extremely well done...

The power of regex is that you can search for patterns rather than string literals, and while its syntax is difficult to penetrate, there is nothing else like it out there. There is a plethora of resources online with regex examples - almost certainly anything you need will be posted somewhere...
Steep learning curve but worth the effort... and it's pretty much universal, in any programming language.

And needless to say, the poster in the Use-List (which I don't read, I already get far too many emails to maintain sanity) would do well to use this, as the problem described is very simple to solve with regex...
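For the original filter question, a possible sketch using the regex form of the filter command is below. As far as I know, square brackets in a LiveCode wildcard pattern match any single character from the set rather than a list of alternative strings, which would explain why [you,u] also let through "with us" and "with yours". The pattern here is a variation on the one above, not the same one: it assumes a literal space after "with" and uses \b (word boundary) so that "with you." followed by punctuation still counts. The variable name tText is a placeholder, and the field names are the ones from the first post.

```livecode
on mouseUp
   local tText
   put field "lf" into tText
   -- keep only the lines containing "with you" or "with u" as whole words
   filter lines of tText with regex pattern "\bwith (?:you|u)\b"
   put tText into field "fPERKED"
end mouseUp
```

This assumes a LiveCode version recent enough to accept "regex pattern" in the filter command; on older versions the matchText loop is the fallback.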