Ruby and the tools 740Tools03RubyRegExpr. CSCE 740 Software Engineering. Topics: Ruby Reg Expressions. January 22, 2014. Ruby http://www.ruby-doc.org/ Starting-up http://ruby.about.com/od/tutorialsontheweb/tp/10waysfree.htm Rails and Beyond
Ruby and the tools 740Tools03RubyRegExpr CSCE 740 Software Engineering • Topics • Ruby Reg Expressions • January 22, 2014
Ruby • http://www.ruby-doc.org/ • Starting-up • http://ruby.about.com/od/tutorialsontheweb/tp/10waysfree.htm • Rails and Beyond • http://ruby.railstutorial.org/ruby-on-rails-tutorial-book
Ruby.new • Really Object Oriented – everything is an object • For Java you might use pos = Math.abs(num) • In Ruby: num = -123; pos = num.abs • Variables inside literal strings: #{ … } notation • puts "the absolute value of #{num} is #{pos}\n" • Variable Name Punctuation • name – local variable • $name, $NAME – globals (avoid using them) • @name – instance variables • @@name – class variables • Name – class names and constants Programming Ruby 1.9 Dave Thomas
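The points on this slide can be tried in a few lines. The sketch below is illustrative only: the method name describe_abs and the class Counter are hypothetical and do not come from the course material.

```ruby
# Everything is an object, so an Integer responds to methods directly.
def describe_abs(num)
  pos = num.abs
  # #{...} is evaluated only inside double-quoted strings.
  "the absolute value of #{num} is #{pos}"
end

puts describe_abs(-123)
puts 'single quotes leave #{num} alone'

# Hypothetical class showing the naming punctuation.
class Counter
  LIMIT = 10              # constant: starts with an uppercase letter
  @@instances = 0         # class variable: shared by all instances

  attr_reader :count

  def initialize(start)
    @count = start        # instance variable: one per object
    @@instances += 1
  end

  def self.instances
    @@instances
  end
end

counter = Counter.new(3)  # local variable: plain lowercase name
puts counter.count
puts Counter.instances
```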
Puts_examples • song = 1 • sam = "" • def sam.play(a) • "duh dum, da dum de dum ..." • end • puts "gin joint".length • puts "Rick".index("c") • puts 42.even? • puts sam.play(song) • print "string with no newline" Programming Ruby 1.9 Dave Thomas
Method definition • def say_goodnight(name) • result = "Good night, " + name • return result • end • # Time for bed... • puts say_goodnight("John-Boy") • puts say_goodnight("Mary-Ellen") Programming Ruby 1.9 Dave Thomas
CSE Linux Labs • 1D39 – Combination on secure site • CSE home page – login with CEC credentials • Computer resources/Computer labs • List of Linux workstations • IP addresses – machine names • ifconfig – interface configuration: shows the IPv4 address and the hardware (Ethernet) address • "man keyword" – online Unix documentation
List of Ruby 3 Examples • Getting Started – "hello, Ruby Programmer" • Intro – • Hello1 – def say_goodnight • Puts examples • Cmd_line – command line args passed into ruby program • "arrays" – non-homogeneous • hash_with_symbol_keys_19.rb – • Weekdays – control structures – if-elsif-else example • Tutclasses-
google(ruby 1.9 tutorial) • Ruby Programming Language - ruby-lang.org/ • http://www.ruby-doc.org/ (again) • http://pragprog.com/titles/ruby3/source_code (again) • http://pragprog.com/book/ruby3/programming-ruby-1-9 “Buy the book” page • Regular Expressions (download pdf) • Namespaces, Source Files, and Distribution (download pdf) • Ruby Library Reference • Built-in Classes and Modules (download pdf of the entry for class Array) • Standard Library • http://www.ruby-doc.org/stdlib-1.9.3/
def sayGoodnight(name) • result = "Goodnight, " + name • return result • end • # Time for bed... • puts sayGoodnight("John-Boy") • puts sayGoodnight("Mary-Ellen") Programming Ruby 1.9 Dave Thomas
Simplifying • def sayGoodnight(name) • result = "Goodnight, #{name}" • return result • end • # Simplifying further by eliminating the return statement: a method returns the value of its last expression • def sayGoodnight(name) • result = "Goodnight, #{name}" • end Programming Ruby 1.9 Dave Thomas
Arrays and Hashes • a = [ 1, 'cat', 3.14 ] # array with three elements • a[0] >> 1 • a[2] = nil • # dump array • p a >> [1, "cat", nil] • empty1 = [] • empty2 = Array.new • a = %w{ ant bee cat dog elk } Programming Ruby 1.9 Dave Thomas
Hashes • instSection = { • 'cello' => 'string', • 'clarinet' => 'woodwind', • 'drum' => 'percussion', • 'oboe' => 'woodwind', • 'trumpet' => 'brass', • 'violin' => 'string' • } • In the online text ">>" means "evaluates to" • instSection['oboe'] >> "woodwind" • instSection['cello'] >> "string" • instSection['bassoon'] >> nil Programming Ruby 1.9 Dave Thomas
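A lookup for a key that is not in the hash returns nil rather than raising an error. The sketch below uses a shortened copy of the slide's hash; the constant name INST_SECTION, the method section_for, and the 'unknown' fallback are assumptions made for illustration.

```ruby
# Shortened copy of the slide's hash, stored in a constant for reuse.
INST_SECTION = {
  'cello'    => 'string',
  'clarinet' => 'woodwind',
  'oboe'     => 'woodwind'
}

# Hash#fetch with a second argument supplies a fallback for missing keys.
def section_for(instrument)
  INST_SECTION.fetch(instrument, 'unknown')
end

p INST_SECTION['oboe']          # key present: the stored value
p INST_SECTION['bassoon']       # key absent: nil
p INST_SECTION.key?('bassoon')  # distinguishes "absent" from "stored nil"
p section_for('bassoon')        # fallback instead of nil
```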
histogram = Hash.new(0) • histogram['key1'] = histogram['key1'] + 1 Programming Ruby 1.9 Dave Thomas
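Hash.new(0) makes every missing key read as 0, which is what lets the increment above work on a key that has never been seen. A minimal sketch of a word counter built on that idea follows; the method name word_histogram and the sample sentence are illustrative assumptions.

```ruby
# Count how often each word occurs in a piece of text.
def word_histogram(text)
  histogram = Hash.new(0)                  # unseen keys read as 0
  text.downcase.scan(/\w+/).each do |word|
    histogram[word] += 1                   # no nil check needed
  end
  histogram
end

counts = word_histogram("the cat and the dog")
p counts['the']     # seen twice
p counts['zebra']   # never seen: the default, 0
```

Reading a missing key does not store it, so counts['zebra'] leaves the hash unchanged.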
Control Structures IF • if count > 10 # body • puts "Try again" • elsif tries == 3 • puts "You lose" • else • puts "Enter a number" • end http://ruby-doc.org/docs/ProgrammingRuby/
Control Structures While • while weight < 100 and numPallets <= 30 • pallet = nextPallet() • weight += pallet.weight • numPallets += 1 • end http://ruby-doc.org/docs/ProgrammingRuby/
More control • puts "Danger, Will Robinson" if radiation > 3000 • while square < 1000 • square = square*square • end • More concisely, but readability?? • square = square*square while square < 1000 http://ruby-doc.org/docs/ProgrammingRuby/
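The block form and the modifier form of while compute the same thing, which the sketch below checks. The method names and the "All clear" message are hypothetical; note that the starting value must be at least 2, since 0 and 1 never grow when squared and the loop would not end.

```ruby
# if-modifier: the guard follows the statement it controls.
def warning_for(radiation)
  return "Danger, Will Robinson" if radiation > 3000
  "All clear"
end

# Block form of the loop.
def square_until_block(square)
  while square < 1000
    square = square * square
  end
  square
end

# Modifier form: same loop on one line.
def square_until_modifier(square)
  square = square * square while square < 1000
  square
end

puts warning_for(5000)
puts square_until_block(2)
puts square_until_modifier(2)
```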
Regular Expressions • Regular expressions are expressions that specify collections of strings (formally, languages) • Main operators: if r and s are regular expressions, then so are: • r | s – alternation, which denotes L(r) ∪ L(s) • rs – concatenation, which denotes L(r) L(s) • r* – Kleene closure, which denotes zero or more strings from L(r) concatenated • Base regular expressions • strings are regular expressions that match themselves, L("ab") = {"ab"} • the empty string ε is a regular expression, L(ε) = {""}
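The three formal operators map directly onto Ruby pattern syntax. The sketch below is one way to test language membership: it wraps a pattern in \A and \z so that the whole string, not just a substring, must match. The helper in_language? and the constant names are assumptions made for illustration.

```ruby
# True when the entire string belongs to the language of the pattern.
def in_language?(pattern, string)
  !!(/\A(?:#{pattern.source})\z/ =~ string)
end

ALTERNATION   = /ab|cd/     # L(r) union L(s): {"ab", "cd"}
CONCATENATION = /abcd/      # L(r) L(s): {"abcd"}
KLEENE        = /(?:ab)*/   # zero or more copies of "ab"

p in_language?(ALTERNATION, "cd")
p in_language?(ALTERNATION, "abcd")
p in_language?(CONCATENATION, "abcd")
p in_language?(KLEENE, "")        # zero occurrences is allowed
p in_language?(KLEENE, "ababab")
p in_language?(KLEENE, "aba")
```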
Ruby Regular Expressions http://rubular.com/
Quoting • All characters match themselves except . | ( ) [ ] { } + \ ^ $ * ? • Escape any of these characters with a backslash if you want it to be treated as a regular character to match: • show_regexp('yes | no', /\|/) • # => yes ->|<- no • show_regexp('yes (no)', /\(no\)/) • # => yes ->(no)<- • show_regexp('are you sure?', /e\?/) • # => are you sur->e?<- http://ruby-doc.org/docs/ProgrammingRuby/
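When the text to match comes from a variable, the escaping can be done by the standard Regexp.escape method instead of by hand. The sketch below compares an escaped and an unescaped pattern; the helper name literal_pattern is hypothetical.

```ruby
# Build a pattern that matches the given text literally.
def literal_pattern(text)
  Regexp.new(Regexp.escape(text))
end

puts Regexp.escape('(no)')                     # backslashes added before ( and )

# Escaped: finds the literal "e?" near the end of the string.
p(literal_pattern('e?') =~ 'are you sure?')

# Unescaped: ? means "optional e", so an empty match succeeds at offset 0.
p(/e?/ =~ 'are you sure?')
```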
Matching Strings with Patterns • The Ruby operator =~ matches a string against a pattern. It returns the character offset into the string at which the match occurred, or nil if there is no match: • /cat/ =~ "dog and cat" # => 8 • /cat/ =~ "catch" # => 0 • /cat/ =~ "Cat" # => nil • You can put the string first if you prefer: • "dog and cat" =~ /cat/ # => 8 • "catch" =~ /cat/ # => 0 • "Cat" =~ /cat/ # => nil http://ruby-doc.org/docs/ProgrammingRuby/
Regular Expressions Examples http://ruby-doc.org/docs/ProgrammingRuby/
Using Match results in Control Flow • str = "cat and dog" • if str =~ /cat/ • puts "There's a cat here somewhere" • end • Testing when a pattern does not match a string, using !~: • File.foreach("testfile").with_index do |line, index| • puts "#{index}: #{line}" if line !~ /on/ • end • produces: • 1: This is line two • 2: This is line three http://ruby-doc.org/docs/ProgrammingRuby/
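The !~ example depends on a file named testfile whose contents are not shown on the slide. The sketch below runs the same filter over an in-memory array instead; the four lines in TEST_LINES are an assumption about what the file holds, and lines_without is a hypothetical helper.

```ruby
# Assumed contents of "testfile", one array element per line.
TEST_LINES = [
  "This is line one",
  "This is line two",
  "This is line three",
  "And so on..."
]

# Collect "index: line" for every line the pattern does NOT match.
def lines_without(lines, pattern)
  result = []
  lines.each_with_index do |line, index|
    result << "#{index}: #{line}" if line !~ pattern
  end
  result
end

puts lines_without(TEST_LINES, /on/)
```

Lines 0 and 3 are skipped because "one" and "on..." both contain "on".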
Changing Strings with Patterns • str = "Dog and Cat" • new_str1 = str.sub(/a/, "*") • new_str2 = str.gsub(/a/, "*") • puts "Using sub: #{new_str1}" • puts "Using gsub: #{new_str2}" • produces: • Using sub: Dog *nd Cat • Using gsub: Dog *nd C*t http://ruby-doc.org/docs/ProgrammingRuby/
Grouping • You can use parentheses to group terms within a regular expression. Everything within the group is treated as a single regular expression. • # This matches an 'a' followed by one or more 'n's • show_regexp('banana', /an+/) # => b->an<-ana • # This matches the sequence 'an' one or more times • show_regexp('banana', /(an)+/) # => b->anan<-a
Grouping to save portions that match • /(\d\d):(\d\d)(..)/ =~ "12:50am" # => 0 • "Hour is #$1, minute #$2" # => "Hour is 12, minute 50" • /((\d\d):(\d\d))(..)/ =~ "12:50am" # => 0 • "Time is #$1" # => "Time is 12:50" • "Hour is #$2, minute #$3" # => "Hour is 12, minute 50" • "AM/PM is #$4" # => "AM/PM is am"
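The $1, $2, ... variables are set as a side effect of the most recent match. An alternative, sketched below, is to call Regexp#match and read the groups from the MatchData object it returns. The method name parse_time and the hash keys are illustrative choices.

```ruby
# Return the captured parts of a time such as "12:50am", or nil.
def parse_time(text)
  m = /(\d\d):(\d\d)(..)/.match(text)   # MatchData on success, nil on failure
  return nil unless m
  { hour: m[1], minute: m[2], meridiem: m[3] }   # m[0] is the whole match
end

p parse_time("12:50am")
p parse_time("noon")
```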
show_regexp • def show_regexp(string, pattern) • match = pattern.match(string) • if match • "#{match.pre_match}->#{match[0]}<-#{match.post_match}" • else • "no match" • end • end http://ruby-doc.org/docs/ProgrammingRuby/
show_regexp Examples • show_regexp('very interesting', /t/) • # => very in->t<-eresting • show_regexp('Fats Waller', /a/) • # => F->a<-ts Waller • show_regexp('Fats Waller', /lle/) • # => Fats Wa->lle<-r • show_regexp('Fats Waller', /z/) • # => no match http://ruby-doc.org/docs/ProgrammingRuby/
Anchors http://ruby-doc.org/docs/ProgrammingRuby/
Anchor Regular Expressions Examples • str = "this is\nthe time" • show_regexp(str, /^the/) • # => this is\n->the<- time • show_regexp(str, /is$/) • # => this ->is<-\nthe time • show_regexp(str, /\Athis/) • # => ->this<- is\nthe time • show_regexp(str, /\Athe/) # => no match • show_regexp("this is\nthe time", /\bis/) • # => this ->is<-\nthe time • show_regexp("this is\nthe time", /\Bis/) • # => th->is<- is\nthe time http://ruby-doc.org/docs/ProgrammingRuby/
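The difference between the line anchors (^ and $) and the string anchors (\A and \z) matters most when validating input that may contain newlines. The sketch below shows the offsets for the slide's string and a hypothetical validator, all_digits?, that uses the string anchors.

```ruby
STR = "this is\nthe time"

p(/^the/ =~ STR)     # ^ matches after the newline, at the start of line two
p(/\Athe/ =~ STR)    # \A only matches at the start of the string
p(/is$/ =~ STR)      # $ matches just before the newline
p(/is\z/ =~ STR)     # \z only matches at the very end of the string

# Validation sketch: the whole string must be digits.
def all_digits?(text)
  !!(/\A\d+\z/ =~ text)
end

p all_digits?("123")
p all_digits?("123\nabc")
p(/^\d+$/ =~ "123\nabc")   # line anchors accept the first line alone
```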
Character classes • character class - [characters] matches any single character between the brackets • For example • [aeiou] • [cat] = [tac] • negated class - [^xyz] matches any single character not x, y or z • show_regexp('Price $12.', /[aeiou]/) # => Pr->i<-ce $12. • show_regexp('Price $12.', /[\s]/) # => Price-> <-$12. • show_regexp('Price $12.', /[$.]/) # => Price ->$<-12. http://ruby-doc.org/docs/ProgrammingRuby/
Character class Examples • a = 'see [The PickAxe-page 123]' • show_regexp(a, /[A-F]/) • # => see [The Pick->A<-xe-page 123] • show_regexp(a, /[A-Fa-f]/) • # => s->e<-e [The PickAxe-page 123] • show_regexp(a, /[0-9]/) • # => see [The PickAxe-page ->1<-23] • show_regexp(a, /[0-9][0-9]/) • # => see [The PickAxe-page ->12<-3] http://ruby-doc.org/docs/ProgrammingRuby/
Character class abbreviations http://ruby-doc.org/docs/ProgrammingRuby/
Ruby Regular Expr Changing text • if line =~ /Perl|Python/ • puts "Scripting language mentioned: #{line}" • end • line.sub(/Perl/, 'Ruby') # replace first 'Perl' with 'Ruby' • line.gsub(/Python/, 'Ruby') # replace every 'Python' with 'Ruby' http://ruby-doc.org/docs/ProgrammingRuby/
Regexp Constructor • Regexps are created using the • /.../ and • %r{...} literals (think of "r" as raw: a '/' inside the pattern needs no escaping), and by the • Regexp::new constructor.
the %r{...} syntax • The %r syntax is particularly useful when creating patterns that contain forward slashes • %r{...} is equivalent to the /.../ notation, but allows '/' in the regexp without escaping: • %r{/home/user} is equivalent to /\/home\/user/ • /mm\/dd/ # => /mm\/dd/ • Regexp.new("mm/dd") # => /mm\/dd/ • %r{mm/dd} # => /mm\/dd/
Regular Expressions Examples • date = "12/25/2010" • date =~ %r{(\d+)(/|:)(\d+)(/|:)(\d+)} • [$1,$2,$3,$4,$5] # => ["12", "/", "25", "/", "2010"] • date =~ %r{(\d+)(?:/|:)(\d+)(?:/|:)(\d+)} • [$1,$2,$3] # => ["12", "25", "2010"] http://ruby-doc.org/docs/ProgrammingRuby/
Backslash Sequences in Substitutions • puts "fred:smith".sub(/(\w+):(\w+)/, '\2, \1') • puts "nercpyitno".gsub(/(.)(.)/, '\2\1') • produces: • smith, fred • encryption • More use of backslashes • \& (last match), • \+ (last matched group), • \` (string prior to match), • \' (string after match), and • \\ (a literal backslash).
str = 'a\b\c' # => "a\b\c" • str.gsub(/\\/, '\\\\\\\\') # => "a\\b\\c" • However, using the fact that \& is replaced by the matched string, you could also write this: • str = 'a\b\c' # => "a\b\c" • str.gsub(/\\/, '\&\&') # => "a\\b\\c" • If you use the block form of gsub, the string for substitution is analyzed only once: • str = 'a\b\c' # => "a\b\c" • str.gsub(/\\/) { '\\\\' } # => "a\\b\\c"
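Because backslashes are hard to count by eye, one way to confirm that the three forms agree is to compare results and string lengths. In the sketch below the three method names are hypothetical wrappers around the calls shown on the slide.

```ruby
# Replacement string: 8 typed backslashes -> 4 in the string -> 2 after gsub.
def double_with_string(text)
  text.gsub(/\\/, '\\\\\\\\')
end

# \& stands for the matched text, so two copies of it double the backslash.
def double_with_ampersand(text)
  text.gsub(/\\/, '\&\&')
end

# Block results are used as-is: 4 typed backslashes -> 2 in the result.
def double_with_block(text)
  text.gsub(/\\/) { '\\\\' }
end

str = 'a\b\c'                          # 5 characters: a \ b \ c
puts str.length
puts double_with_string(str).length    # two characters longer
puts double_with_string(str) == double_with_ampersand(str)
puts double_with_string(str) == double_with_block(str)
```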
Backslashes in Patterns • same = "12:15-12:45" • differ = "12:45-13:15" • # use numbered backreference • same =~ /(\d\d):\d\d-\1:\d\d/ # => 0 • differ =~ /(\d\d):\d\d-\1:\d\d/ # => nil
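Inside a pattern, \1 matches whatever text the first group captured, not the group's pattern again. The sketch below wraps the slide's check in a method and adds a second, hypothetical use: finding an accidentally doubled word. Both method names are illustrative.

```ruby
# True when both times in "hh:mm-hh:mm" share the same hour.
def same_hour?(range)
  !!(/(\d\d):\d\d-\1:\d\d/ =~ range)
end

# Return the first word that appears twice in a row, or nil.
def doubled_word(text)
  m = /\b(\w+) \1\b/.match(text)
  m && m[1]
end

p same_hour?("12:15-12:45")
p same_hour?("12:45-13:15")
p doubled_word("it is is fine")
p doubled_word("this is fine")
```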
Cases • /pat/i # i for ignore case • /\.(gif|jpg|jpeg|png)$/i • def mixed_case(name) • name.downcase.gsub(/\b\w/) {|first| first.upcase } • end • mixed_case("DAVE THOMAS") # => "Dave Thomas" • mixed_case("dave thomas") # => "Dave Thomas" • mixed_case("dAvE tHoMas") # => "Dave Thomas"
Rubular • Rubular: a Ruby regular expression editor and tester http://rubular.com/
(?<month>\d{1,2})\/(?<day>\d{1,2})\/(?<year>\d{4}) • Today's date is: 1/21/2014.
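The Rubular example uses named groups, written (?<name>...). In Ruby code the captured text can then be read from the MatchData object by name. A minimal sketch follows; the constant DATE_PATTERN and the method parse_date are hypothetical names.

```ruby
# Same pattern as the Rubular example, stored in a constant.
DATE_PATTERN = /(?<month>\d{1,2})\/(?<day>\d{1,2})\/(?<year>\d{4})/

# Return the date parts as integers, or nil when no date is found.
def parse_date(text)
  m = DATE_PATTERN.match(text)
  return nil unless m
  { month: m[:month].to_i, day: m[:day].to_i, year: m[:year].to_i }
end

p parse_date("Today's date is: 1/21/2014.")
p parse_date("no date here")
```

Named groups keep the code readable when groups are later added or reordered, since nothing depends on their position.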
Regular Expression Options • i Case insensitive. The pattern match will ignore the case of letters in the pattern and string. • o Substitute once. Any #{...} substitutions in a particular regular expression literal will be performed just once, the first time it is evaluated. Otherwise, the substitutions will be performed every time the literal generates a Regexp object. • m Multiline mode. Normally, "." matches any character except a newline. With the /m option, "." matches any character. • x Extended mode. Complex regular expressions can be difficult to read. The x option allows you to insert spaces and newlines in the pattern to make it more readable. You can also use # to introduce comments.
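A short sketch of the i, m and x options in use. The pattern TIME_PATTERN and the helper dot_matches_newline? are illustrative assumptions; the o option is left out because its effect only shows up when a literal with interpolation is evaluated repeatedly.

```ruby
# x: whitespace and # comments inside the pattern are ignored.
# i: "AM" and "am" both match.
TIME_PATTERN = /
  (\d\d)     # hour
  :
  (\d\d)     # minute
  (am|pm)    # meridiem
/xi

m = TIME_PATTERN.match("12:50AM")
puts m[1]
puts m[3]

# m: decides whether "." may match a newline.
def dot_matches_newline?(pattern)
  !!(pattern =~ "a\nb")
end

p dot_matches_newline?(/a.b/)
p dot_matches_newline?(/a.b/m)
```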
Blocks • a = %w( ant bee cat dog elk ) # create an array • a.each { |animal| puts animal } # iterate over the contents • Yield – will be discussed next time • [ 'cat', 'dog', 'horse' ].each do |animal| • print animal, " -- " • end http://ruby-doc.org/docs/ProgrammingRuby/
{ puts "Hello" } # this is a block • do # • club.enroll(person) # and so is this • person.socialize • end http://ruby-doc.org/docs/ProgrammingRuby/
Blocks • 5.times { print "*" } • 3.upto(6) {|i| print i } • ('a'..'e').each {|char| print char } • *****3456abcde http://ruby-doc.org/docs/ProgrammingRuby/
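The three iterators above print their output on a single line because print adds no newline. The sketch below collects the same characters into a string so the result can be inspected; block_demo is a hypothetical method name, and the last loop uses do...end only to show that both block forms behave alike.

```ruby
# Build the string that the three print loops would write.
def block_demo
  out = +""                            # unary plus gives a mutable string
  5.times { out << "*" }               # block with no parameter
  3.upto(6) { |i| out << i.to_s }      # block parameter between | |
  ('a'..'e').each do |char|            # do...end form of the same idea
    out << char
  end
  out
end

puts block_demo
```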