Read + Write + Report
Home | Start a blog | About Orble | FAQ | Sites | Writers | Advertise | My Orble | Login

Finding a string value in a SQL Server table

January 1st 2009 12:50
Finding a string value in a SQL Server table


Problem
A SQL Server/Web Application Developer in my company approached me yesterday with a dilemma. He needed to be able to find a phrase within a specific field in a SQL Server database he was developing. If he received a "hit" then the process was to return the related value in a field within a parent table. This is a typically simple process that can be handled with the use of the T-SQL CHARINDEX() function, however there were certain caveats in his requirements that would not allow CHARINDEX() to be the complete solution to his problem. He needed to only consider situations where the value of the field started with the phrase or if the phrase was the beginning of a word in the value of the search field it was a valid hit. Read on and I'll explain further.


Solution
The web application being developed was for a medical website under which a user could select a phrase and search on various symptoms to return a related record in a table for possible malady. A phrase in this case could be multiple words or a single letter. Therefore if you selected your search phrase to be "B" and there were values in the search field of "Bone Mass", "Thumb Pain", or "Tumor, Benign" the results would be a positive hit on "Bone Mass" (because the value starts with the letter B) and "Tumor, Benign" (because it is preceded by a space). Even though the word "Thumb" includes the letter B in the search criteria it is omitted because it is, essentially, not the first letter of a word in the value.

Let's dismiss the medical jargon for a second and take a look at an example where I relate the content back to SQL Server. For the sake of this example I will go on the basis that we have two tables: ##ProductChild (with values pertaining to relational database management systems) and ##ProductParent, which will be used to store just two records that state whether a given record in ##ProductChild is a SQL Server database system or not. We will use "SQL" as our search criteria.


--Create Product Table and enter in sample values
CREATE TABLE ##ProductChild (ID INT, ChildDescription VARCHAR(100), FKID INT)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (1, 'SQL Server', 10)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (2, 'MySQL', 20)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (3, 'Oracle', 30)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (4, 'SQL Server 2000', 10)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (4, 'SQL Server 2005', 10)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (4, 'SQL Server 2008', 10)

INSERT INTO ##ProductChild (ID, ChildDescription, FKID)
VALUES (5, 'Enterprise Edition SQL Server 2008', 10)

--Create Parent SQL Server Yes/No table and enter in sample values
CREATE TABLE ##ProductParent (ID INT, Item VARCHAR(100))

INSERT INTO ##ProductParent (ID, Item)
VALUES (10, 'SQL Server')

INSERT INTO ##ProductParent (ID, Item)
VALUES (20, 'Non-SQL Server')

At this point if we query each table we will receive the following results:

SELECT ID, ChildDescription, FKID
FROM ##ProductChild
ORDER BY ID

ID Child Description FKID
1 sql server 10
2 my sql 20 3 oracle 30
4 sql server 2000 10
4 sql server 2005 10
4 sql server 2008 10
5 enterprise edition sql server 2008 10

SELECT ID, Item
FROM ##ProductParent
ORDER BY ID

ID Item
10 sql server
20 non-sql server

There are two different approaches that can be taken to perform this query from this point forward. I will highlight both options and note now that the actual execution plan and execution times were identical.

Option 1: Use the CHARINDEX() Function

Now, using these sample tables we will search for all records in ##ProductChild where ChildDescription either starts with "SQL" or contains a word that starts with "SQL". We will ignore any results that do not meet this criteria, even if "SQL" appears in the field within a given word. I'll do so by breaking a rule I try to live by: no functions in the WHERE clause. These tables are extremely small. In the real-world scenario that prompted this article, we are also not dealing with more than a couple thousand records either.

--Declare necessary variables
DECLARE @searchstring VARCHAR(20)
DECLARE @lensearch smallint

--Set values for search
SELECT @searchstring = 'SQL'
SELECT @lensearch = LEN(@searchstring)

--Perform Search
SELECT DISTINCT P.ID, P.Item
FROM ##ProductParent P INNER JOIN ##ProductChild C ON P.ID = C.FKID
WHERE
C.ChildDescription LIKE @searchstring '%'
OR
CHARINDEX(' ' @searchstring, C.ChildDescription, 1) > 0
ORDER BY P.ID

DROP TABLE ##ProductChild
DROP TABLE ##ProductParent



Taking a closer look at this code it is extremely straight-forward compared to the ideas he and I tossed about: cursors, loops, all sorts of madness. We end up breaking the two positive hit possibilities into two parts of an OR statement:

If ##ProductChild.ChildDescripti on starts with "SQL": C.ChildDescription LIKE @searchstring
If ##ProductChild.ChildDescripti on contains a word that starts with "SQL": ('' @searchstring , C.ChildDescription , 1) > 0
CHARINDEX() is the T-SQL function used to search for a given value within a field. It accepts three parameters:

Search Value
Field to Search
Starting Position in Field
In the sample code I append a space onto the beginning of the search string so as to force the CHARINDEX function to ignore any hits within words in the ChildDescription table.

Option 2: Use the LIKE Condition

Instead of using CHARINDEX() to account for "hits" where the search string is preceded by a space we can instead simply append a space and '%' wildcard before the @searchstring variable as shown below. Ultimately this is probably the better solution as the format of the programming is consistent between criteria being searched upon. There is also one less variable and therefore less overhead involved in this code block. Simply put, the code is more intuitive from a readability standpoint.

--Declare necessary variables
DECLARE @searchstring VARCHAR(20)

--Set values for search
SELECT @searchstring = 'SQL'

--Perform Search
SELECT DISTINCT P.ID, P.Item
FROM ##ProductParent P INNER JOIN ##ProductChild C ON P.ID = C.FKID
WHERE
C.ChildDescription LIKE @searchstring '%'
OR
C.ChildDescription LIKE '% ' @searchstring '%'
ORDER BY P.ID

DROP TABLE ##ProductChild
DROP TABLE ##ProductParent



End Result:

Regardless of which option I use, the results are identical. I get valid hits on 4 records in the ##ProductChild.ChildDescripti on and therefore receive these results for the query:

ID Item
10 sql server



If I was to search based upon "My" I'd receive the following results, due to the fact that MySQL is not a Microsoft SQL Server product.


ID Item
20 non-sql server


Of course the proper process is to encapsulate this into a stored procedure. Using what we just learned this is the final result:

CREATE PROCEDURE uspSearchWord @searchstring VARCHAR(100) AS
SELECT DISTINCT P.ID, P.Item
FROM ##ProductParent P INNER JOIN ##ProductChild C ON P.ID = C.FKID
WHERE
C.ChildDescription LIKE @searchstring '%'
OR
C.ChildDescription LIKE '% ' @searchstring '%'
ORDER BY P.ID




43
Vote
Add To: del.icio.us Digg Furl Spurl.net StumbleUpon Yahoo


   
Subscribe to this blog 


Just this blog This blog and DailyOrble (recommended)

   

   

   


Add A Comment

To create a fully formatted comment please click here.


CLICK HERE TO LOGIN | CLICK HERE TO REGISTER

Name or Orble Tag
Home Page (optional)
Comments
Bold Italic Underline Strikethrough Separator Left Center Right Separator Quote Insert Link Insert Email
Notify me of replies
Notify extra people about this comment
Is this a private comment?
List the Email Addresses or Orble Tags of the people you would like to be notified about this comment


One per line max of 30

List the Email Addresses or Orble Tags of the people you would like to be notified about this private comment thread. Only the people in this list will be able to see or reply to your comment.


One per line max of 30

Your Name
(for the email going out to the above list, it can be different to your Orble Tag)
Your Email Address
(optional)
(required for reply notification)
Submit
More Posts
4 Posts
11 Posts
1 Posts
77 Posts dating from May 2008
Email Subscription
Receive e-mail notifications of new posts on this blog:
0

Siddharth sood's Blogs

36 Vote(s)
0 Comment(s)
1 Post(s)
82 Vote(s)
0 Comment(s)
2 Post(s)
37 Vote(s)
0 Comment(s)
1 Post(s)
49 Vote(s)
0 Comment(s)
1 Post(s)
29 Vote(s)
0 Comment(s)
1 Post(s)
75 Vote(s)
0 Comment(s)
2 Post(s)
165 Vote(s)
0 Comment(s)
4 Post(s)
207 Vote(s)
2 Comment(s)
5 Post(s)
25 Vote(s)
0 Comment(s)
1 Post(s)
24 Vote(s)
0 Comment(s)
1 Post(s)
Moderated by Siddharth sood
Copyright © 2006 2007 2008 On Topic Media PTY LTD. All Rights Reserved. Design by Vimu.com.
On Topic Media ZPages: Sydney |  Melbourne |  Brisbane |  London |  Birmingham |  Leeds     [ Advertise ] [ Contact Us ] [ Privacy Policy ]