REALbasic Tip: Finding whole words in a string

To: "REALbasic Tips" <realbasic-tips at lists dot realsoftware dot com>
Subject: REALbasic Tip: Finding whole words in a string
From: Geoff Perlman <geoff at realsoftware dot com>
Date: Tue, 30 Sep 2003 18:11:50 -0500
Finding a specific word inside a string might seem straightforward enough. After all, the InStr function in REALbasic will find a string within another string. But that's not always good enough. If you need to find a whole word, InStr may not get the job done, depending on the contents of the string you are searching. Say you are searching for the word "at" inside the string "Kate is at the store purchasing a bat." InStr matches "at" wherever it appears, including inside "Kate" and "bat", so the position it returns is that of the "at" in "Kate" rather than the word itself. You could search for " at " (with a space on either side of the word to make sure you don't find instances in the middle of another word), but that won't work if the word is followed by punctuation, as in "Where are you at?", or if it is the first or last word in the string.
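
To make the problem concrete, here is a small sketch of the InStr calls described above. The variable names are only for illustration, and the positions in the comments are counted by hand from the sample strings.

   dim source as string
   dim pos as integer

   source = "Kate is at the store purchasing a bat."

   // InStr returns the one-based position of the first match.
   // Here that is 2, the "at" inside "Kate", not the word "at" at position 9.
   pos = InStr(source, "at")

   // Padding with spaces works for this string: the match starts at
   // position 8, the space just before the word.
   pos = InStr(source, " at ")

   // But it returns 0 here, because "at" is followed by "?" and not a space.
   pos = InStr("Where are you at?", " at ")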

Fortunately, there is an easy solution. In REALbasic v3.5 we added support for Regular Expressions. Regular Expressions allow you to search and manipulate text in just about any way you can imagine, and they are supported by many programming languages. Here's an example of a function that uses Regular Expressions to search for a whole word within a string. You don't need to learn how Regular Expressions work to use this function. Just copy it into your code and away you go. This function uses a syntax very similar to that of the InStr function. You pass it the source string you want to search through and the string you want to find. It then returns the position of the find string within the source string, or 0 if the word is not found.

Function FindWholeWord(source As string, find As string) As integer
   dim re as regEx
   dim match as regExMatch

   re = new regEx

   re.searchPattern = "(?<!\w)" + find + "(?!\w)"
   match = re.search(source)

   if (match <> nil) then
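      'SubExpressionStartB(0) is the zero-based byte offset of the whole match,
      'so add 1 to return a one-based position as InStr does
      '(for single-byte text this is also the character position)
      return match.SubExpressionStartB(0) + 1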
   else
      // didn't find anything
      return 0
   end if
End Function
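
As a quick usage sketch, calling the function with the sample strings from earlier in this tip should behave as follows. The positions in the comments are counted by hand and assume plain single-byte text.

   dim pos as integer

   // 9: skips the "at" in "Kate" and finds the standalone word
   pos = FindWholeWord("Kate is at the store purchasing a bat.", "at")

   // 15: the trailing "?" is not a word character, so the word still matches
   pos = FindWholeWord("Where are you at?", "at")

   // 0: "at" only occurs inside another word
   pos = FindWholeWord("bat", "at")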

Unlike InStr, this function has no optional starting position, so it always returns the first occurrence of the word. However, it could easily be modified to accept a starting position as a parameter. Also, this function will only work reliably for words. Punctuation, for example, could cause it to return the wrong results, because characters such as "?", "." and "*" are interpreted as Regular Expression commands. For the purposes of this example, we've kept it simple. Next week, I'll show you the same function improved to support any type of string.
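
In the meantime, here is one way the two limitations might be addressed. This is only a sketch, not the improved function promised for next week. The names EscapeForRegEx and FindWholeWordFrom are made up for this example, and it assumes that the Search method of the RegEx class accepts an optional zero-based starting offset as its second parameter (check the Language Reference for your version).

Function EscapeForRegEx(s As string) As string
   dim specials as string
   dim result as string
   dim c as string
   dim i as integer

   // characters that have a special meaning in a search pattern
   specials = "\^$.|?*+()[]{}"

   for i = 1 to Len(s)
      c = Mid(s, i, 1)
      if InStr(specials, c) > 0 then
         // put a backslash in front so the character is matched literally
         result = result + "\"
      end if
      result = result + c
   next

   return result
End Function

Function FindWholeWordFrom(source As string, find As string, start As integer) As integer
   dim re as regEx
   dim match as regExMatch

   // treat anything below 1 as "start at the beginning", like InStr
   if start < 1 then
      start = 1
   end if

   re = new regEx
   re.searchPattern = "(?<!\w)" + EscapeForRegEx(find) + "(?!\w)"

   // assumption: the second parameter is a zero-based offset into source
   match = re.search(source, start - 1)

   if (match <> nil) then
      return match.SubExpressionStartB(0) + 1
   else
      return 0
   end if
End Function

To find every occurrence, you could call FindWholeWordFrom in a loop, passing the previous result plus one as the new starting position until it returns 0. As with the original function, the positions are byte offsets, so text containing multi-byte characters would need extra care.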

The syntax for search patterns is a bit cryptic. However, once you start using it, you can do some pretty amazing things. If you'd like to learn more about Regular Expressions, read "Searching using Regular Expressions" in the User's Guide and check out the RegEx class in the Language Reference.

This tip was inspired by Cortis Clark.

--
Geoff Perlman
President and CEO
REAL Software, Inc.
512-328-7325 x711 (voice)
512-328-7372 (fax)
