Reading input in a specific format

Let's say I have the following line, entered from the terminal:

"Algorithms 1" by Robert Sedgewick

The format of this string will always be:
1. A double quote
2. A sequence of characters (may contain spaces)
3. A closing double quote
4. One or more spaces
5. The word "by"
6. One or more spaces
7. A sequence of characters (may contain spaces)

Knowing the above format, how can I read this?

I tried to use fmt.Scanf(), but it treats the word after each space as a separate value. I looked at regexes, but I couldn't figure out whether there is a function that lets me extract the values rather than just validate the input.

2 answers


You should use groups (parentheses) to get the information you need:

"([\w\s]*)"\sby\s([\w\s]+)\.

      

This returns two groups:

  • [1-13] Algorithms 1

  • [18-34] Robert Sedgewick



Your regex library should provide a method that returns all matches in the text. Each match then contains the groups. Note that this pattern ends with \. so it only matches input that ends with a period; drop the \. if your input has none.

In Go, I think this is FindAllStringSubmatch ( https://github.com/StefanSchroeder/Golang-Regex-Tutorial/blob/master/01-chapter2.markdown )

Test it here: https://regex101.com/r/cT2sC5/1
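
For reference, the group approach could look roughly like this in Go. This is a minimal sketch, not the answerer's own code: the helper name extractGroups is hypothetical, and the sample input is assumed to end with a period because the pattern requires one.

```go
package main

import (
	"fmt"
	"regexp"
)

// groupPattern is the pattern from the answer above, unchanged.
var groupPattern = regexp.MustCompile(`"([\w\s]*)"\sby\s([\w\s]+)\.`)

// extractGroups (hypothetical helper) returns the title and author
// captured by the two groups for every match found in text.
func extractGroups(text string) [][2]string {
	var out [][2]string
	for _, m := range groupPattern.FindAllStringSubmatch(text, -1) {
		// m[0] is the whole match; m[1] and m[2] are the two groups.
		out = append(out, [2]string{m[1], m[2]})
	}
	return out
}

func main() {
	// Assumed input: note the trailing period that the pattern requires.
	for _, g := range extractGroups(`"Algorithms 1" by Robert Sedgewick.`) {
		fmt.Println("Title:", g[0])
		fmt.Println("Author:", g[1])
	}
}
```

Without the trailing period, FindAllStringSubmatch finds no match and the loop body never runs.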



1) By searching for a character

The input format is simple enough that you can just search for characters in the string with strings.IndexRune():

s := `"Algorithms 1" by Robert Sedgewick`

s = s[1:]                      // Exclude the first double quote
x := strings.IndexRune(s, '"') // Find the 2nd double quote
title := s[:x]                 // Title is between the 2 double quotes
author := s[x+5:]              // Skip the 5 characters of `" by `, the rest is the author

      

Printing results with:

fmt.Println("Title:", title)
fmt.Println("Author:", author)

      

Output:

Title: Algorithms 1
Author: Robert Sedgewick

      

Try it on the Go Playground.
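
The snippet above trusts the input completely: if a quote is missing, IndexRune returns -1 and the slicing panics. A checked variant might look like the sketch below. The function name parseByIndex and the error messages are made up for illustration, and it still assumes exactly one space on each side of "by".

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseByIndex (hypothetical helper) applies the IndexRune idea
// and reports an error instead of panicking on malformed input.
func parseByIndex(s string) (title, author string, err error) {
	if !strings.HasPrefix(s, `"`) {
		return "", "", errors.New("input does not start with a double quote")
	}
	s = s[1:] // exclude the first double quote
	x := strings.IndexRune(s, '"')
	if x < 0 {
		return "", "", errors.New("closing double quote not found")
	}
	rest := s[x+1:] // everything after the closing quote
	if !strings.HasPrefix(rest, " by ") {
		return "", "", errors.New(`expected " by " after the title`)
	}
	return s[:x], rest[len(" by "):], nil
}

func main() {
	title, author, err := parseByIndex(`"Algorithms 1" by Robert Sedgewick`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```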

2) By splitting

Another solution is to use strings.Split():

s := `"Algorithms 1" by Robert Sedgewick`

parts := strings.Split(s, `"`)
title := parts[1]      // First part is empty, 2nd is title
author := parts[2][4:] // 3rd is author, but cut off " by "

      

The output is the same. Try it on the Go Playground.
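
The question allows several spaces around "by", which the fixed [4:] offset does not handle. One possible way to tolerate that with the same Split idea is sketched below. The name parseBySplit is hypothetical, and the sketch assumes neither the title nor the author contains a double quote.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBySplit (hypothetical helper) splits on the double quote and
// then trims a variable number of spaces around the word "by".
func parseBySplit(s string) (title, author string, ok bool) {
	parts := strings.Split(s, `"`)
	// Expect: empty part before the 1st quote, the title, then the rest.
	if len(parts) != 3 || parts[0] != "" || !strings.HasPrefix(parts[2], " ") {
		return "", "", false
	}
	rest := strings.TrimSpace(parts[2]) // e.g. "by   Robert Sedgewick"
	after := strings.TrimPrefix(rest, "by")
	if len(after) == len(rest) {
		return "", "", false // the word "by" is missing
	}
	author = strings.TrimLeft(after, " ")
	if len(author) == len(after) {
		return "", "", false // no space after "by", or no author at all
	}
	return parts[1], author, true
}

func main() {
	title, author, ok := parseBySplit(`"Algorithms 1"   by    Robert Sedgewick`)
	fmt.Println(ok)
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```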



3) With "complex" splitting

If we cut off the first double quote, we can split on the separator `" by `. This gives exactly two parts: the title and the author. Since the first double quote has been cut off, the separator can only appear at the end of the title (the title cannot contain double quotes according to your rules):

s := `"Algorithms 1" by Robert Sedgewick`

parts := strings.Split(s[1:], `" by `)
title := parts[0]  // First part is exactly the title
author := parts[1] // 2nd part is exactly the author

      

Try it on the Go Playground.
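
On Go 1.18 or later, strings.Cut expresses the same "split into exactly two parts" idea and also reports whether the separator was found. The sketch below is an alternative to the Split call above, not part of the original answer. The name parseByCut is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parseByCut (hypothetical helper) cuts the input once at `" by `.
// Assumes Go 1.18+ for strings.Cut.
func parseByCut(s string) (title, author string, ok bool) {
	if !strings.HasPrefix(s, `"`) {
		return "", "", false
	}
	// Drop the first double quote, then cut at the first `" by `.
	title, author, ok = strings.Cut(s[1:], `" by `)
	if !ok {
		return "", "", false
	}
	return title, author, true
}

func main() {
	title, author, ok := parseByCut(`"Algorithms 1" by Robert Sedgewick`)
	fmt.Println(ok)
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```

Unlike indexing into the result of Split, this does not panic when the separator is absent.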

4) With regexp

If after all the above solutions, you still want to use regexp, here's how you could do it:

Use parentheses to mark the submatches you want to extract. You want 2 parts: the title between the quotes and the author that follows "by". You can use Regexp.FindStringSubmatch() to get the matching parts. Note that the first element of the returned slice is the complete match (here, the whole input), so the relevant parts are the ones after it:

s := `"Algorithms 1" by Robert Sedgewick`

r := regexp.MustCompile(`"([^"]*)" by (.*)`)
parts := r.FindStringSubmatch(s)
title := parts[1]  // parts[0] is always the complete match, parts[1] is the title
author := parts[2] // parts[2] is the author

      

Try it on the Go Playground.
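
Since the line comes from the terminal, the regexp approach could be combined with bufio.Scanner, which reads a whole line including spaces (unlike fmt.Scanf). The following is a sketch under some assumptions: the names linePattern, parseLine and readEntry are hypothetical, and the pattern is a variant of the one above that is anchored and uses \s+ to allow several spaces around "by".

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"regexp"
)

// linePattern is an anchored variant of the pattern above that
// allows one or more whitespace characters around "by".
var linePattern = regexp.MustCompile(`^"([^"]*)"\s+by\s+(.*)$`)

// parseLine (hypothetical helper) extracts title and author from one line.
func parseLine(line string) (title, author string, ok bool) {
	parts := linePattern.FindStringSubmatch(line)
	if parts == nil { // nil means the line did not match
		return "", "", false
	}
	return parts[1], parts[2], true
}

// readEntry (hypothetical helper) reads a single line from r and parses it.
func readEntry(r io.Reader) (title, author string, ok bool) {
	sc := bufio.NewScanner(r)
	if !sc.Scan() {
		return "", "", false
	}
	return parseLine(sc.Text())
}

func main() {
	title, author, ok := readEntry(os.Stdin)
	if !ok {
		fmt.Println("input does not match the expected format")
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```

Checking for a nil result matters here because FindStringSubmatch returns nil on a non-matching line, and indexing parts[1] would then panic.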
