Word count with index

I have to count the first 10 words in a blog post that is being read, but my code won't allow that to happen. I can't use .split, String.isEmpty, or arrays, which leaves me with indexOf and substring. My code right now only gets the first 3 words. Any help?

This is the spec I should be following:

String getSummary() method
1. Returns up to the first ten words of the entry as a summary of the entry. If the entry is 10 words or less, the method returns the entire entry.
2. Possible logic: the indexOf method of the String class can find the position of a space. Use this in conjunction with a loop to find the first 10 words.

public class BlogEntry 
{
    private String username;
    private Date dateOfBlog;
    private String blog;

    public BlogEntry() 
    {
        username = "";
        dateOfBlog = new Date();
        blog = "";
    }

    public BlogEntry(String sName, Date dBlogDate, String sBlog)
    {
        username = sName;
        dateOfBlog = dBlogDate;
        blog = sBlog;
    }

    public String getUsername()
    {
        return username;
    }

    public Date getDateOfBlog()
    {
        return dateOfBlog;
    }

    public String getBlog()
    {
        return blog;
    }

    public void setUsername(String sName)
    {
        username = sName;
    }

    public void setDateOfBlog(Date dBlogDate)
    {
        dateOfBlog.setDate(dBlogDate.getMonth(), dBlogDate.getDay(), dBlogDate.getYear());
    }

    public void setBlog(String sBlog)
    {
        blog = sBlog;
    }

    public String getSummary()
    {
        String summary = "";
        int position;
        int wordCount = 0;
        int start = 0;
        int last;

        position = blog.indexOf(" ");
        while (position != -1 && wordCount < 10)
        {
            summary += blog.substring(start, position) + " ";
            start = position + 1;
            position = blog.indexOf(" ", position + 1);
            wordCount++;
        }

        return summary;
    }

    public String toString()
    {
        return "Author: " + this.getUsername() + "\n\n" + "Date posted: " + this.getDateOfBlog() + "\n\n" + "Text body: " + this.getBlog();
    }
}

      



5 answers


Add this to your code:

public static void main(String[] args) 
{
    BlogEntry be = new BlogEntry("" , new Date(), "this program is pissing me off!");
    System.out.println( be.getSummary() );        
}

      

Produces this output:

this program is pissing me

      

That's not 3 words, it's 5. You should have 6. And that makes your mistake a lot easier to understand. You are experiencing a typical off-by-one error. You only add and count the words that are followed by a space. This leaves out the last word, because it is not followed by a space; it only comes after the last space.

Here's code close to where you started that can see all 6 words:

public String getSummary()
{
    if (blog == null) 
    {
        return "<was null>";
    }

    String summary = "";
    int position;
    int wordCount = 0;
    int start = 0;

    position = blog.indexOf(" ");
    while (position != -1 && wordCount < 10)
    {
        // Copy the word that ends at this space
        summary += blog.substring(start, position) + " ";
        start = position + 1;
        position = blog.indexOf(" ", position + 1);
        wordCount++;
    }
    // The last word has no space after it, so add it separately
    if (wordCount < 10) 
    {
        summary += blog.substring(start, blog.length());
    }

    return summary;
}

      



which, when tested with this:

public static void main(String[] args) 
{
    String[] testStrings = {
          null //0
        , ""
        , " "
        , "  "
        , " hi"
        , "hi "//5
        , " hi "
        , "this program is pissing me off!"
        , "1 2 3 4 5 6 7 8 9"
        , "1 2 3 4 5 6 7 8 9 "
        , "1 2 3 4 5 6 7 8 9 10"//10
        , "1 2 3 4 5 6 7 8 9 10 "
        , "1 2 3 4 5 6 7 8 9 10 11"
        , "1 2 3 4 5 6 7 8 9 10 11 "
        , "1 2 3 4 5 6 7 8 9 10 11 12"
        , "1 2 3 4 5 6 7 8 9 10 11 12 "//15
    };

    ArrayList<BlogEntry> albe = new ArrayList<>();

    for (String test : testStrings) {
        albe.add(new BlogEntry("" , new Date(), test));
    }

    testStrings[0] = "<was null>";

    for (int i = 0; i < albe.size(); i++ ) {
        assert(albe.get(i).getSummary().equals(testStrings[Math.min(i,11)]));
    }

    for (BlogEntry be : albe)
    {
        System.out.println( be.getSummary() );        
    }
}

      

will produce this:

<was null>



 hi
hi 
 hi 
this program is pissing me off!
1 2 3 4 5 6 7 8 9
1 2 3 4 5 6 7 8 9 
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10 
1 2 3 4 5 6 7 8 9 10 
1 2 3 4 5 6 7 8 9 10 
1 2 3 4 5 6 7 8 9 10 
1 2 3 4 5 6 7 8 9 10 

      

Also, I don't know where you are importing Date from, but neither import java.util.Date; nor import java.sql.Date; will make your code compile without errors. I had to comment out your setDate code.

If your instructor allows it, you can of course try the ideas in these other answers, but I thought you wanted to know what's going on.
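
In the output above, entries longer than ten words come back with a trailing space. If that matters, here is a minimal sketch of another way to cut the string, using only indexOf and substring. The class name SummarySketch and the method firstWords are hypothetical and not part of the original BlogEntry class, and the sketch assumes words are separated by exactly one space.

```java
class SummarySketch {
    // Hypothetical helper, not part of the original BlogEntry class.
    // Assumes words are separated by exactly one space.
    static String firstWords(String text, int limit) {
        if (text == null || limit <= 0) {
            return "";
        }
        int position = -1;
        // Walk from space to space; the limit-th space ends the limit-th word
        for (int found = 0; found < limit; found++) {
            position = text.indexOf(" ", position + 1);
            if (position == -1) {
                // Fewer than `limit` spaces, so there are at most `limit` words
                return text;
            }
        }
        // Cut just before the space that follows the last wanted word
        return text.substring(0, position);
    }

    public static void main(String[] args) {
        System.out.println(firstWords("1 2 3 4 5 6 7 8 9 10 11 12", 10));
        System.out.println(firstWords("only three words", 10));
    }
}
```

Instead of copying word by word, this looks for the tenth space and takes one substring up to it, so there is no last word to forget. Leading or repeated spaces would still be counted as word boundaries here, so treat it as a starting point rather than a finished method.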



I'm not sure how efficient that would be, but can you just truncate the string every time you grab the index? For example:

TempBlog content:
This is a test
is a test
a test
test

Summary content:
This
This is
This is a
This is a test

public String getSummary()
{
    String summary = "";
    int wordCount = 0;
    //Create a copy so you don't overwrite original blog
    String tempBlog = blog;

    while (wordCount < 10)
    {
        int space = tempBlog.indexOf(" ");
        if (space == -1)
        {
            //No space left: whatever remains is the last word
            summary += tempBlog;
            break;
        }
        summary += tempBlog.substring(0, space) + " ";
        tempBlog = tempBlog.substring(space + 1);
        wordCount++;
    }

    return summary;
}

      



String.indexOf also provides an overload that lets you start searching from a specific index (see the String API documentation). With this method, it's pretty easy:

public int countWort(String in , String word){
    int count = 0;

    int index = in.indexOf(word);

    while(index != -1){
        ++count;

        index = in.indexOf(word , index + 1);
    }

    return count;
}
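
As a usage sketch: if the words are separated by single spaces, counting the spaces with this method and adding one gives the number of words. The class name CountDemo and the sample sentence are made up for illustration, and the method is copied here as static so the example runs on its own.

```java
class CountDemo {
    // Copy of the countWort method above, made static for this demo
    static int countWort(String in, String word) {
        int count = 0;
        int index = in.indexOf(word);
        while (index != -1) {
            ++count;
            // Continue the search just past the previous match
            index = in.indexOf(word, index + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "this program is driving me mad";
        int spaces = countWort(text, " ");
        // Six words are separated by five spaces
        System.out.println("spaces: " + spaces + ", words: " + (spaces + 1));
    }
}
```

Note that this only counts; to build the ten-word summary you would still need substring to cut the text at the tenth space.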

      



Try this logic ...

public static void main(String[] args) throws Exception {
    String data = "This one sentence has exactly 10 words in it ok";

    int wordIndex = 0;
    int spaceIndex = 0;
    int wordCount = 0;
    while (wordCount < 10 && spaceIndex != -1) {
        spaceIndex = data.indexOf(" ", wordIndex);
        System.out.println(spaceIndex > -1 
                ? data.substring(wordIndex, spaceIndex)
                : data.substring(wordIndex));

        // The next word "should" be right after the space
        wordIndex = spaceIndex + 1;
        wordCount++;
    }
}

      

Results:

This
one
sentence
has
exactly
10
words
in
it
ok

      

UPDATE

Is regex not an option? With the help of regex, you can try the following:

// Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void main(String[] args) throws Exception {
    String data = "The quick brown fox jumps over the lazy dog The quick brown fox jumps over the lazy dog";
    Matcher matcher = Pattern.compile("\\w+").matcher(data);

    int wordCount = 0;
    while (matcher.find() && wordCount < 10) {
        System.out.println(matcher.group());
        wordCount++;
    }
}

      

Results:

The
quick
brown
fox
jumps
over
the
lazy
dog
The

      

The regular expression \w+ matches words made up of the characters [a-zA-Z_0-9], so punctuation is not part of a match.
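
To see what that character set means in practice, here is a small sketch comparing \w+ with \S+ (runs of non-whitespace) on text that contains punctuation. The class RegexWordsDemo, the helper firstMatches, and the sample sentence are all made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWordsDemo {
    // Hypothetical helper: joins up to `limit` matches of `regex` with '|'
    static String firstMatches(String regex, String data, int limit) {
        Matcher matcher = Pattern.compile(regex).matcher(data);
        StringBuilder out = new StringBuilder();
        int count = 0;
        while (count < limit && matcher.find()) {
            if (count > 0) {
                out.append('|');
            }
            out.append(matcher.group());
            count++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String data = "Don't stop, it's fine";
        // \w+ splits at the apostrophes and drops the comma
        System.out.println(firstMatches("\\w+", data, 10)); // Don|t|stop|it|s|fine
        // \S+ keeps everything between whitespace together
        System.out.println(firstMatches("\\S+", data, 10)); // Don't|stop,|it's|fine
    }
}
```

So with \w+ a word like "Don't" counts as two words. If the summary should treat anything between spaces as a word, \S+ is probably closer to what the assignment means.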



I think we can find where the first 10 words end by checking whether each character is a space character. Here's an example:

public class FirstTenWords
{
    public static void main( String[] args )
    {
        String sentence = "There are ten words in this sentence, I want them to be extracted";
        String summary = firstOf( sentence, 10 );
        System.out.println( summary );
    }

    public static String firstOf( String line, int limit )
    {
        boolean isWordMode = false;
        int count = 0;
        int i;
        for( i = 0; i < line.length(); i++ )
        {
            char character = line.charAt( i );
            if( Character.isSpaceChar( character ) )
            {
                if( isWordMode )
                {
                    isWordMode = false;
                    // Stop at the space that ends the last wanted word
                    if( count >= limit )
                    {
                        break;
                    }
                }
            }
            else
            {
                if( !isWordMode )
                {
                    isWordMode = true;
                    count++;
                }
            }
        }
        return line.substring( 0, i );
    }
}

Output:

There are ten words in this sentence, I want them

      
