How to identify a given keyword out of 1000 potential words?

So, I don't have a professional developer, but I do programming from time to time. I want to write code and look for guidance on how to control a parser that reads a text file by looking at each line as a string and trying to detect the input on that line. Any string can be one of over 1000 different keywords, which is the tricky part. Once I have this keyword, I feel like there should be a significantly more efficient method of defining what it is, rather than implementing 1000 if-else statements or 1000 case-break statements. Once I have matched the given keyword, I plan to move on to a generic routine that instantiates an object of this type. I don't want to do 999 tests before finding my target, it's just scraps that I like.I tried to break it down alphabetically, which makes it much smaller, but there is still an unmanageably large number of if-else statements.

I've already figured out that I can't nest more than 128 if-else statements, so my current alternative is 1000s of only "if" statements without matching "else", which I know is bad practice. So here's a summary of my current code:

if (keyword_str.compare(keyword1)) {
        Parse(keyword1); // A general routine to parse all these similarly formatted keywords
        } 
if (keyword_str.compare(keyword2)) {
        Parse(keyword2);
        } 
if (keyword_str.compare(keyword3)) {
        Parse(keyword3);
        }
//
//
//

if (keyword_str.compare(keyword999)) {
        Parse(keyword999);
        }
if (keyword_str.compare(keyword1000)) {
        Parse(keyword1000);
        }

      

Any help would be greatly appreciated! Thank!


Okay, so, here's the point I'm at but still getting lost on how to use a map to determine the type of an object and then instantiate that object. Here are some code snippets:

class baseClass
    {
    public:
        baseClass();
        ~baseClass();
    };
//
// keyword1 class declaration
class keyword1 : public baseClass
    {
    public:
        // Constructors
        keyword1 () { cout << "keyword1 constructor..." << endl;}
        ~keyword1 () { cout << "keyword1 destructor..." << endl;}

    protected:

    };
//
// keyword2 class declaration
class keyword2 : public baseClass
    {
    public:
        // Constructors
        keyword2 () { cout << "keyword2 constructor..." << endl;}
        ~keyword2 () { cout << "keyword2 destructor..." << endl;}

    protected:

    };
//
// keyword3 class declaration
class keyword3 : public baseClass
    {
    public:
        // Constructors
        keyword3 () { cout << "keyword3 constructor..." << endl;}
        ~keyword3 () { cout << "keyword3 destructor..." << endl;}

    protected:

    };


//
//*******************


    map <string, baseClass> keyword_map;

    keyword_map.insert (make_pair ("keyword1", keyword1 )); // ########## This is where I'm lost
    keyword_map.insert (make_pair ("keyword2", keyword2 )); // ########## This is where I'm lost
    keyword_map.insert (make_pair ("keyword3", keyword3 )); // ########## This is where I'm lost

    // Search for keyword
    string searching_for = "keyword3";
    map <string, baseClass> ::const_iterator it = keyword_map.find(searching_for);


    if (it == keyword_map.end()) {
        cout << "No keyword found." << endl;
            }
        else 
            {
        cout << "Found the keyword!" << endl;
        it->second; // ########## This is where I'm lost
            }

      

+3


source to share


3 answers


Once I have matched the given keyword, I plan to move on to a generic routine that instantiates an object of this type.

Your intuition that you don't want 1000 different IF statements is correct.

In particular, I would suggest thinking about how the old school map directory works (if you've ever seen one, help young people understand what it is?)

enter image description here

Card catalogs are useful because you don't start with the first drawer and go through all the items in sequence and then move on to the next drawer. Instead, you have a quick test that you can use to find out which box to search for. This quick test includes a fingerprint or hash candidate. Old library card catalogs often use a very simple "hash function" (first or two letters), this drawer contains all the cards for books whose names start with "S-Ti"). You reduce the number of comparisons you need to make on the basis of this test, so that you look in only one box.

If it sounds like a lot of work to come up with a way to fingerprint and record them in buckets like this, you're in luck. All of this is already done for you under the hood of the standard library. Unless your needs are very specialized (or your keywords have odd patterns in them, where they all have the same "fingerprint") ... std :: unordered_map should work.

Select a "key" that is std::string

representing your keywords. The "value" will be a factory of some kind ... a way to create your objects from the stuff following the keyword. This could be the basis of your "best way" ...

.. BUT

In terms of initialization std::unordered_map

to fulfill your bets in this case, if your "values" in the map are for constructing another class, 1000 is a lot of classes. You might want to lay out more details before you type class ObjectOne

and number them before class ObjectOneThousand

, which sounds every bit as dubious as doing 1000 IF statements for comparison.

Therefore, you might want to look more in the chat or on some other forum before continuing with this idea.




UPDATE for edit response

There is a problem in your code in terms of keyword classes. Are they meant to represent a keyword class (as in ... only exists so that many of them were ever created when you have keywords?). You should be skeptical about a class that has only one instance and is a class of things; what's for this class. If it makes sense.: - /

So what you want to put on your map are not instances of the keyword. This is more than you conceptually want to put in a keyword class that you call later. In spirit it would look like this:

#include <typeinfo>

map <string, type_info &> keyword_map;

keyword_map.insert (make_pair ("keyword1", typeid(keyword1) )); 
keyword_map.insert (make_pair ("keyword2", typeid(keyword2) )); 
keyword_map.insert (make_pair ("keyword3", typeid(keyword3) ));

      

You might think that later you can call something make_class

with type_info, but that won't work. From here ... the idea of ​​keeping a factory function to get this behavior. I'll give you a simple answer with a static member function, so in each keyword class you would have something like this:

class keyword1 : public baseClass {
    // ...
    static shared_ptr<baseClass> factory() {
        return make_shared<keyword3>();
    }
    // ...
};

      

Because it is a static member, it looks like a regular function. You can take its address, store the pointer, and then call it without any instance of the class to call it. It returns a generic pointer to the base class, and even though everything you end up with is a pointer to the base class ... it will act polymorphically on any virtual functions you define in the base class interface, depending on the kind of keyword it is.

(Note that you need to make your destructors virtual in such cases! It's best to do this by default, and only not if you have a really good reason.)

map <string, shared_ptr<baseClass>(*)()>> keyword_map;

keyword_map.insert (make_pair ("keyword1", &keyword1::factory )); 
keyword_map.insert (make_pair ("keyword2", &keyword2::factory )); 
keyword_map.insert (make_pair ("keyword3", &keyword3::factory ));

      

Now, when you find a keyword, you call the function you are returning from find

to get an instance of the corresponding keyword class. And then do whatever you planned to do with the object instances.

But I think it will be difficult for you to define a base class interface that suits you in such a design. This is again why I said 1000 classes show that you may not have a problem that you want to get closer to what you think. I also think you will have a lot of other questions, but please make them new posts of your own. :-)

+5


source


An unordered_map

will work really fast here. It is implemented as a hashmap, so the search performance is roughly O (1).

std :: unordered_map



In C ++ 11, you can use std::string

both your key and std::function<>

your value. I wrote an example here that shows how to use lambda functions in unordered_map

:

#include <iostream>
#include <unordered_map>
#include <functional>

using namespace std;

typedef unordered_map<string, function<void ()>> ufmap;
ufmap M;

int main() {
    // Create Keys and functions
    string s_a = "A";
    function<void ()> f_a = [](){
        cout << "Construct type A here" << endl;
    };

    string s_b = "B";
    function<void ()> f_b = [](){
        cout << "Construct type B here" << endl;
    };

    // Add Keys and functions to the map
    M.insert(pair<string, function<void ()>>(s_a, f_a));
    M.insert(pair<string, function<void ()>>(s_b, f_b));

    // Finding a string and using its function
    string searching_for = "A";

    ufmap::const_iterator it = M.find(searching_for);

    if (it == M.end()) {
        // String key wasn't found
    }
    else {
        it->second();
    }
}

      

+2


source


Two solutions:

  • Use a lookup table. It's better if your keyword list is dynamic at all and can be O (1) if you are using a hash table.

  • Use a lexical analyzer like flex (1) and create keywords in the .l file. It can be even faster, as it goes to the character in one go without the final search step, but only fits if the keyword list is completely corrected beforehand, for example in a programming language.

+1


source







All Articles