Parsing URLs using C strings in C++
I am learning C++ for one of my CS classes, and for our first project I need to parse some URLs using C strings (i.e. I cannot use the C++ string class).
The only approach I can think of is to iterate over the characters (since it's a char[]) and use some switch statements. For someone more experienced in C++: is there a better approach? Could you point me to a good online resource? I haven't found one yet.
It's strange that you are not allowed to use a feature of the C++ language itself, namely C++ strings!
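The C standard library has a number of C string functions, e.g.:
strdup - duplicate a string (POSIX rather than standard C before C23; the caller must free() the result)
strtok - break a string into tokens. Beware: this modifies the original string.
strcpy - copy a string
strstr - find a string within a string
strncpy - copy up to n bytes of a string (the result is not null-terminated if the source is n bytes or longer)
etc.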
There is a good online reference here with a complete list of the available C string functions for searching and finding things:
http://www.cplusplus.com/reference/clibrary/cstring/
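As a small illustration of how two of the functions listed above fit together, here is a minimal sketch that uses strstr to locate "://" and strncpy to copy out what comes before it, without touching the original string. The helper name copy_scheme and its return convention are my own assumptions for the example, not anything from the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the scheme of `url` (the text before "://")
// into `out`, which has room for `out_size` chars including the terminator.
// Returns true on success, false if there is no "://" or `out` is too small.
bool copy_scheme(const char* url, char* out, std::size_t out_size) {
    assert(url != nullptr && out != nullptr);

    const char* sep = std::strstr(url, "://");  // first occurrence, or nullptr
    if (sep == nullptr) {
        return false;
    }

    std::size_t len = static_cast<std::size_t>(sep - url);
    if (len + 1 > out_size) {
        return false;  // not enough room for the scheme plus '\0'
    }

    std::strncpy(out, url, len);  // copies exactly len chars, adds no terminator
    out[len] = '\0';              // so terminate explicitly
    return true;
}
```

Called with a 16-byte buffer and "http://stackoverflow.com/questions", this should leave "http" in the buffer. Note the explicit terminator: strncpy does not add one when it stops at the byte limit.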
You can step through the strings by accessing them like an array if you need to.
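e.g.:
#include <cstring>
#include <iostream>

int main() {
    const char* url = "http://stackoverflow.com/questions/1370870/c-strings-in-c";
    std::size_t len = std::strlen(url);
    for (std::size_t i = 0; i < len; ++i) {
        std::cout << url[i];  // index the C string like an array
    }
    std::cout << std::endl;
}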
As for how to actually do the parsing, you will have to work that out on your own. It is an assignment, after all.
There are a number of standard C library functions that can help you.
First, let's look at the standard C library function strtok. It lets you extract the parts of a C string that are separated by given delimiters. For example, you can tokenize on the / delimiter to get the protocol, the domain, and then the file path. You can then tokenize the domain on the . delimiter to get the subdomain(s), second-level domain, and top-level domain, and so on.
It's not nearly as powerful as the regular-expression parser you would really want for parsing URLs, but it works on C strings, is part of the C standard library, and is probably adequate for your purposes.
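To make the strtok idea concrete, here is a minimal sketch. The helper split_in_place and its parameters are hypothetical names I made up for illustration. The important assumption is that the buffer is a writable copy of the URL (a char array, not a string literal), because strtok overwrites the delimiters it consumes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: splits `buf` in place on any character in `delims` and
// stores up to `max_parts` token pointers in `parts`. Returns the token count.
// `buf` must be writable: strtok replaces each delimiter it consumes with '\0'.
std::size_t split_in_place(char* buf, const char* delims,
                           char* parts[], std::size_t max_parts) {
    assert(buf != nullptr && delims != nullptr && parts != nullptr);

    std::size_t count = 0;
    // The first call passes the buffer; later calls pass nullptr to continue
    // from where the previous call stopped.
    for (char* tok = std::strtok(buf, delims);
         tok != nullptr && count < max_parts;
         tok = std::strtok(nullptr, delims)) {
        parts[count++] = tok;
    }
    return count;
}
```

Splitting a char array holding "http://stackoverflow.com/questions/1370870/c-strings-in-c" on "/" should give five tokens, starting with "http:" (note the trailing colon) and "stackoverflow.com". Once that pass has finished, the second token can itself be split on "." to get "stackoverflow" and "com". The two passes must not be interleaved, since strtok keeps a single internal position between calls.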
Other C library functions that might help:
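- strstr() finds the first occurrence of a substring, similar to std::string::find() (it locates the substring; it does not copy it out the way std::string::substr() does)
- strspn(), strchr(), and strpbrk() find a character or set of characters in a string, similar to std::string::find_first_of() etc.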
Edit: A reminder that the correct way to use these functions in C++ is to include <cstring> and use them in the std:: namespace, for example std::strtok().
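Here is a rough sketch of how the search functions above might be used on a URL without modifying it. The helper names (find_host, scheme_span, query_of) are hypothetical, and the code assumes a simple scheme://host[:port][/path][?query][#fragment] shape with no user-info part, which real URLs do not always follow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the length of the host part of `url` and sets
// *host_start to its first character. Returns 0 and leaves *host_start
// untouched if "://" is missing.
std::size_t find_host(const char* url, const char** host_start) {
    assert(url != nullptr && host_start != nullptr);

    const char* sep = std::strstr(url, "://");
    if (sep == nullptr) return 0;
    const char* host = sep + 3;

    // strpbrk: first character that is any of ":/?#", or nullptr if none.
    const char* end = std::strpbrk(host, ":/?#");
    *host_start = host;
    return end != nullptr ? static_cast<std::size_t>(end - host)
                          : std::strlen(host);
}

// strspn: length of the leading run made up only of the listed characters.
// Here: how many leading characters could belong to a scheme name.
std::size_t scheme_span(const char* url) {
    assert(url != nullptr);
    return std::strspn(url, "abcdefghijklmnopqrstuvwxyz"
                            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                            "0123456789+-.");
}

// strchr: first occurrence of a single character, or nullptr.
// Here: the text after '?', if the URL has a query string.
const char* query_of(const char* url) {
    assert(url != nullptr);
    const char* q = std::strchr(url, '?');
    return q != nullptr ? q + 1 : nullptr;
}
```

None of these copy anything. They only hand back pointers and lengths into the original string, so the caller still has to copy the pieces out (for example with strncpy) if separate null-terminated strings are needed.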
I don't know what the requirements for parsing the URLs are, but since this is for a CS class it would make sense to write a (very simple) BNF grammar and a (very simple) recursive descent parser.
This gives a more robust solution than direct iteration, for example when handling malformed URLs.
Very few string functions from the C standard library are needed.
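To give a feel for what that could look like, here is a minimal sketch of a validator written in the recursive descent style, with one function per grammar rule. The grammar is a toy one I am assuming for the example, far smaller than the real URL syntax, and all function names are made up. This toy grammar has no recursive rules, so nothing actually recurses here, but the one-function-per-rule structure is the same.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Toy grammar assumed for this sketch (not the full URL syntax):
//   url    ::= scheme "://" host [ ":" port ] [ path ]
//   scheme ::= ALPHA { ALPHA | DIGIT | "+" | "-" | "." }
//   host   ::= label { "." label }
//   label  ::= ALNUM { ALNUM | "-" }
//   port   ::= DIGIT { DIGIT }
//   path   ::= "/" { any character }
//
// Each parse_* function handles one rule starting at `p`. On success it
// advances `p` past the matched text and returns true; otherwise false.

static bool is_alpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
static bool is_alnum(char c) { return is_alpha(c) || is_digit(c); }

// Consumes the literal `lit` if the input starts with it.
static bool parse_literal(const char*& p, const char* lit) {
    std::size_t n = std::strlen(lit);
    if (std::strncmp(p, lit, n) != 0) return false;
    p += n;
    return true;
}

static bool parse_scheme(const char*& p) {
    if (!is_alpha(*p)) return false;
    ++p;
    while (is_alnum(*p) || *p == '+' || *p == '-' || *p == '.') ++p;
    return true;
}

static bool parse_label(const char*& p) {
    if (!is_alnum(*p)) return false;
    ++p;
    while (is_alnum(*p) || *p == '-') ++p;
    return true;
}

static bool parse_host(const char*& p) {
    if (!parse_label(p)) return false;
    while (*p == '.') {
        ++p;
        if (!parse_label(p)) return false;  // rejects "a..b" and a trailing '.'
    }
    return true;
}

static bool parse_port(const char*& p) {
    if (!is_digit(*p)) return false;
    while (is_digit(*p)) ++p;
    return true;
}

static bool parse_path(const char*& p) {
    if (*p != '/') return false;
    p += std::strlen(p);  // the toy grammar accepts anything after the first '/'
    return true;
}

// Top-level rule: true only if the whole string matches `url`.
bool is_valid_url(const char* s) {
    assert(s != nullptr);
    const char* p = s;
    if (!parse_scheme(p)) return false;
    if (!parse_literal(p, "://")) return false;
    if (!parse_host(p)) return false;
    if (*p == ':') {
        ++p;
        if (!parse_port(p)) return false;
    }
    if (*p == '/' && !parse_path(p)) return false;
    return *p == '\0';  // leftover input means the URL did not match
}
```

The only library calls are strlen, strncmp, and the character classification functions from <cctype>. To extract the parts rather than just validate, each parse_* function could also record where its match started and ended.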