Learn Regex Shortly

REGEX




  • Hello, everyone! This post is a short, easy-to-follow introduction to regular expressions (regex).
  • Getting the exact expression you need to validate input in a program is often difficult, but it becomes much easier once you understand regex well.
  • The most common regex patterns are listed below, each with a brief description to keep this article short.
  • '.' symbol :- . (dot) is a special character that stands for any character.
      • If you want to search for a literal . using regex, you need to escape it with a backslash ‘\’.

      • Ex:-    ‘\.’  this regex will search for . in your file.

      • On its own (unescaped), it matches any single character except a newline.
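
As a small illustration of the dot, here is a sketch using Python's built-in re module. The sample text is made up for this example.

```python
import re

text = "a.b\nc"

# An unescaped dot matches any single character except a newline.
any_char = re.findall(r".", text)

# An escaped dot matches only a literal '.' character.
literal_dot = re.findall(r"\.", text)

print(any_char)     # ['a', '.', 'b', 'c']
print(literal_dot)  # ['.']
```

Note that the newline between b and c is skipped by the unescaped dot.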

  • Meta characters :-   all symbols with a special meaning in regex (such as . ^ $ * + ? | ( ) [ ] { } \).
      • To find or validate one of these symbols literally, escape it with a backslash; otherwise it functions according to its defined meaning.

  • \d 
      • It matches any single digit 0-9 in your string or file.

  • \D
      • It matches any character in the file that is not a digit.

  • \w 
      • It matches any character in the set ( a-z, A-Z, 0-9, _ ).

  • \W 
      • It matches any character that is not in the set ( a-z, A-Z, 0-9, _ ).

  • \s  
      • It matches whitespace characters ( spaces, tabs, newlines ).

  • \S 
      • It matches any character that is not whitespace ( spaces, tabs, newlines ).
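
The six shorthand classes can be tried side by side. This is a minimal sketch with Python's re module; the sample text is invented for the example. One caveat: in Python 3, \d and \w also match non-ASCII digits and letters by default, unless the re.ASCII flag is passed.

```python
import re

text = "Room_7 is\tfree"

digits = re.findall(r"\d", text)       # ['7']
non_digits = re.findall(r"\D", text)   # every character except '7'
word_chars = re.findall(r"\w", text)   # letters, digits and underscore
non_word = re.findall(r"\W", text)     # [' ', '\t']
spaces = re.findall(r"\s", text)       # [' ', '\t']
non_spaces = re.findall(r"\S", text)   # everything except the space and the tab

print("".join(word_chars))  # Room_7isfree
```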




    Anchors


     

  • Anchors are regex patterns that do not match characters. Instead, they match invisible positions between characters in a string or a block of text.
  • \b  ,  \B,   ^,   $.
  • \b 
      • It matches a word boundary: the position between a word character (\w) and a non-word character, or the start or end of the string next to a word character.

      • Ex:- Aa  AaAa

      • Searching for Aa with \b on both sides matches only the standalone Aa. The Aa inside AaAa is not selected because it is attached to other word characters, so there is no boundary between them.

  • \B 
      • It matches a position that is not a word boundary. In our case, \BAa matches the second Aa inside AaAa, because it directly follows another word character.


  • Ex:- \bAa\b applied to Aa AaAa
      • This matches only the first Aa.

      • This is because the pattern requires Aa to have a boundary at both the beginning and the end.
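
To see where the boundaries fall, this sketch (Python's re module, same sample string as the example) prints the positions of the matches instead of the matched text.

```python
import re

text = "Aa AaAa"

# \bAa\b: Aa with a word boundary on both sides -> only the standalone word.
whole_word = [m.span() for m in re.finditer(r"\bAa\b", text)]

# \BAa: Aa that does NOT start at a word boundary -> the Aa at the end of AaAa.
inside_word = [m.span() for m in re.finditer(r"\BAa", text)]

print(whole_word)   # [(0, 2)]
print(inside_word)  # [(5, 7)]
```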


  • ^
      • It matches the position at the beginning of the string.

      • Ex:-    Ha HAHA

      Applying regex :- ^Ha


    • In the above example, the result is the first Ha in Ha HAHA.

    • Only the leading Ha is selected, because it comes at the beginning. If you try the same for an inner element (for example ^HA), it won't work.

    • It is used to find a string at the beginning.

    • It only works for the element at the beginning of the string (or of each line, when multi-line mode is enabled).


  • $
      • The $ sign works in a similar way to the caret symbol ^, but it checks the ending portion: put $ at the end of the pattern and only the end of the string is checked.

      • It determines what must match at the end.

      • Ex:-  HA$ applied to Ha HAHA

      • o/p:- the final HA in Ha HAHA

      • It finds that part because it is the last part of the string.

      • $ is mostly used to check whether a string ends with a given pattern.
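
Both anchors can be checked on the same sample string. This is a sketch with Python's re module; the patterns ^Ha, ^HA and HA$ are chosen here for illustration.

```python
import re

text = "Ha HAHA"

start = re.search(r"^Ha", text)    # Ha at the very beginning
inner = re.search(r"^HA", text)    # HA exists, but not at the beginning
end = re.search(r"HA$", text)      # HA at the very end

print(start.span())  # (0, 2)
print(inner)         # None
print(end.span())    # (5, 7)
```

The pattern HA$ skips the first HA at position 3 because more text follows it.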





  • Note:-  Every regex character class matches exactly one character when you use it once.
      • Ex:- \d matches only one digit at a time, so it finds each digit in the file individually.

      • To match a multi-digit number as a whole, you need to repeat \d as many times as there are digits in that number.
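
A quick way to see the difference, sketched with Python's re module on a made-up sentence:

```python
import re

text = "Order 2024 has 3 items"

# One \d finds every digit separately.
single = re.findall(r"\d", text)

# Four \d in a row find a run of four digits as one match.
four = re.findall(r"\d\d\d\d", text)

print(single)  # ['2', '0', '2', '4', '3']
print(four)    # ['2024']
```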





Character Set


  • []
      • This is the character set. It specifies exactly which characters are allowed at one position in your string.

      • Ex:- \d-\d

      • In the above example we used -, but next time you may need another symbol, and then another. If you want to allow a specific set of symbols or characters, you can do it using a character set [].

      • Ex:- \d[-_]\d

      • It matches only one of the two given symbols between the digits.

      • In a character set we don't have to use the escape character \ for the ‘.’ dot.

      • We can put alphanumeric characters and ranges such as a-z or 0-9 inside [].

      • If we use ^ as the first character inside [], it negates the set and matches everything that is not in the set, instead of its usual function as an anchor.

      • Ex:- [^a-z]

      • The above pattern matches any character that is not a lowercase letter.
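
The following sketch tries the character set examples with Python's re module. The sample strings and the [.] pattern are made up for illustration.

```python
import re

text = "1-2 3_4 5.6 7+8"

# Only '-' or '_' is allowed between the two digits.
dash_or_underscore = re.findall(r"\d[-_]\d", text)

# Inside [] the dot is a literal dot, no backslash needed.
with_dot = re.findall(r"\d[.]\d", text)

# A leading ^ negates the set: anything that is not a lowercase letter.
not_lowercase = re.findall(r"[^a-z]", "aB3_c")

print(dash_or_underscore)  # ['1-2', '3_4']
print(with_dot)            # ['5.6']
print(not_lowercase)       # ['B', '3', '_']
```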


  • ()  
      • This is known as a group.

      • It is used to group part of a pattern, for example a list of alternatives.

      • Ex: \w(a|b|c)

      • In the above example, \w matches one word character, and it must be followed by a or b or c.


  • |
      • It means ‘or’.

      • It lets you offer a choice between multiple characters or patterns.

      • We can use it inside a group.
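
Here is a sketch of a group with alternatives, using Python's re module and an invented sample string. It assumes the pattern \w(a|b|c), that is, one word character followed by a, b, or c.

```python
import re

text = "xa yb zc wd"

# group(0) is the whole match, group(1) is what the ( ) group captured.
matches = [(m.group(0), m.group(1)) for m in re.finditer(r"\w(a|b|c)", text)]

print(matches)  # [('xa', 'a'), ('yb', 'b'), ('zc', 'c')]
```

The last word, wd, is not matched because d is not one of the alternatives.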





    Quantifier



  • *
      • It matches 0 or more of the preceding element.

      • Ex:- Tat

      • To match the above literal, we can use * as :-   \w*

      • It matches from position 0, the character T, to the end of the string, the final t.


  • +
      • It matches one or more of the preceding element.

      • Ex:- T

      • To match the above literal, we can use + as :-   \w+

      • A pattern like T\w+ will not match here, because after T there is no character and + always looks for one or more.

      • \w+ on its own will work and match T, because one character is enough.
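
The difference between * and + shows up at the edges. This sketch uses re.fullmatch from Python's re module, which succeeds only if the whole string matches; the T\w+ and T\w* patterns are added here for illustration.

```python
import re

star_word = re.fullmatch(r"\w*", "Tat")   # matches the whole of 'Tat'
star_empty = re.fullmatch(r"\w*", "")     # zero characters are fine for *
plus_single = re.fullmatch(r"\w+", "T")   # one character is enough for +
plus_empty = re.fullmatch(r"\w+", "")     # None: + needs at least one
t_plus = re.fullmatch(r"T\w+", "T")       # None: nothing follows the T
t_star = re.fullmatch(r"T\w*", "T")       # matches: zero characters may follow

print(star_word.group())  # Tat
print(t_plus)             # None
```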


  • ?
      • It matches 0 or one of the preceding element.


  • {5}
      • An exact number of repetitions (here, exactly 5).


  • {1,5} 
      • A range of repetitions, {minimum, maximum} (here, 1 to 5).
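
A short sketch of ?, {5} and {1,5} with Python's re module. The sample words and numbers are made up for the example.

```python
import re

# ? makes the 'u' optional: both spellings match.
optional = re.findall(r"colou?r", "color colour")

# {5} needs exactly five digits.
exactly_five = re.fullmatch(r"\d{5}", "12345")
too_short = re.fullmatch(r"\d{5}", "1234")

# {1,5} accepts one to five digits; \b keeps longer numbers out.
in_range = re.findall(r"\b\d{1,5}\b", "7 123 123456")

print(optional)   # ['color', 'colour']
print(too_short)  # None
print(in_range)   # ['7', '123']
```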


    • We can use quantifiers to say how many times we want to repeat a regex element.

    • Ex:- \d\d\d\d

    • In the above example we wrote \d 4 times, which means 4 digits.

    • We can simplify this using a quantifier like this :-    \d{4}

    • This works exactly the same as the initial example.

    • This way we can shorten our regex.
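
To confirm that both forms behave the same, here is a sketch with Python's re module on an invented sample text:

```python
import re

text = "PIN 4821, code 77"

long_form = re.findall(r"\d\d\d\d", text)
short_form = re.findall(r"\d{4}", text)

print(long_form)                # ['4821']
print(long_form == short_form)  # True
```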
