YouTube API v3 page tokens

I am using the search API with nextPageToken for paging, but I cannot get all the results this way: I can only retrieve 500 results out of about 455,000.

Here's the Java code that fetches the search results:

youtube = new YouTube.Builder(Auth.HTTP_TRANSPORT, Auth.JSON_FACTORY, new HttpRequestInitializer() {
    public void initialize(HttpRequest request) throws IOException {
        // Nothing to initialize: requests are authorized with an API key.
    }
}).setApplicationName("youtube-search").build();

YouTube.Search.List search = youtube.search().list("id,snippet");
String apiKey = properties.getProperty("youtube.apikey");
search.setKey(apiKey);
search.setType("video");
search.setMaxResults(50L); // setMaxResults takes a Long
search.setQ(queryTerm);

boolean allResultsRead = false;
while (!allResultsRead) {
    SearchListResponse searchResponse = search.execute();
    System.out.println("Printed " + searchResponse.getPageInfo().getResultsPerPage()
            + " out of " + searchResponse.getPageInfo().getTotalResults()
            + ". Current page token: " + search.getPageToken()
            + " Next page token: " + searchResponse.getNextPageToken()
            + ". Prev page token " + searchResponse.getPrevPageToken());
    if (searchResponse.getNextPageToken() == null) {
        // No further pages: stop, and reset the request object.
        allResultsRead = true;
        search = youtube.search().list("id,snippet");
        search.setKey(apiKey);
        search.setType("video");
        search.setMaxResults(50L);
    } else {
        // Ask for the following page on the next iteration.
        search.setPageToken(searchResponse.getNextPageToken());
    }
}

      

Output:

Printed 50 out of 455085. Current page token: null Next page token: CDIQAA. Prev page token null
Printed 50 out of 454983. Current page token: CDIQAA Next page token: CGQQAA. Prev page token CDIQAQ
Printed 50 out of 455081. Current page token: CGQQAA Next page token: CJYBEAA. Prev page token CGQQAQ
Printed 50 out of 454981. Current page token: CJYBEAA Next page token: CMgBEAA. Prev page token CJYBEAE
Printed 50 out of 455081. Current page token: CMgBEAA Next page token: CPoBEAA. Prev page token CMgBEAE
Printed 50 out of 454981. Current page token: CPoBEAA Next page token: CKwCEAA. Prev page token CPoBEAE
Printed 50 out of 455081. Current page token: CKwCEAA Next page token: CN4CEAA. Prev page token CKwCEAE
Printed 50 out of 454980. Current page token: CN4CEAA Next page token: CJADEAA. Prev page token CN4CEAE
Printed 50 out of 455081. Current page token: CJADEAA Next page token: CMIDEAA. Prev page token CJADEAE
Printed 50 out of 455081. Current page token: CMIDEAA Next page token: null. Prev page token CMIDEAE

      

After 10 iterations of the while loop, it terminates because the next page token is null.

I'm new to the YouTube API and not sure what I am doing wrong here. I have two questions: 1. How can I get all the results? 2. Why is the previous page token on page 3 not the same as the current page token of page 2?

Any help would be appreciated. Thanks!



3 answers


What you are seeing is the intended behavior: using nextPageToken, you can get at most 500 results. If you're wondering how this came about, you can read this thread:

https://code.google.com/p/gdata-issues/issues/detail?id=4282

To summarize that thread: with so much data on YouTube, its search algorithms are radically different from what most people expect. It's not a simple search-engine job of matching content in fields; an enormous number of signals are processed to make the results relevant, and after about 500 results the algorithms begin to lose the ability to return anything worthwhile.

One thing that helped me make sense of this is that when YouTube talks about search, it is talking about probability, not matching: the results are ordered, based on your parameters, by their likelihood of being relevant to your query. As you paginate, you eventually reach a point where, statistically speaking, the likelihood of relevance is so low that returning those results is not worth the computational cost. So 500 is the limit.
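To make the arithmetic concrete: at 50 results per page, a cap of 500 results means the loop ends after 10 requests, which matches the output in the question. Below is a minimal Java sketch of that behavior. It does not call YouTube at all; `Page`, `fetchAll`, and `cappedSource` are hypothetical names, and the cap is simulated by a stand-in source rather than taken from the real service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class PagingSketch {
    /** One page of results plus the token for the next page (null when there is none). */
    public static final class Page {
        final List<String> items;
        final String nextPageToken;

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    /** Follows nextPageToken until the source stops handing one out. */
    public static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // the first request carries no page token
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }

    /** Hypothetical stand-in for the search call: serves only the first `cap` items. */
    public static Function<String, Page> cappedSource(int pageSize, int cap) {
        return token -> {
            // In this simulation the token is simply the start offset as text.
            int start = (token == null) ? 0 : Integer.parseInt(token);
            List<String> items = new ArrayList<>();
            for (int i = start; i < Math.min(start + pageSize, cap); i++) {
                items.add("video-" + i);
            }
            int next = start + pageSize;
            return new Page(items, next < cap ? String.valueOf(next) : null);
        };
    }

    public static void main(String[] args) {
        List<String> all = fetchAll(cappedSource(50, 500));
        System.out.println(all.size() + " results over " + (all.size() / 50) + " requests");
    }
}
```

The loop itself is correct; it stops only because the source stops issuing tokens, which is the situation the question describes.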



(Also note that the number of "results" is not an approximation of matches but an approximation of potential matches; when you start retrieving them, many of these potential matches are discarded as not relevant at all, so the number doesn't really mean what people think it does. Google search is similar.)

You may be wondering why YouTube's search works this way rather than doing more traditional string-to-data matching: with that much search volume, if it really did a full search of all the data for each request, you would wait a minute at a time, if not more. It is truly a technical marvel, if you think about it, that the algorithms can return such relevant results for the top 500 when they work on prediction, probability, and so on.

Regarding your second question: page tokens do not identify a unique set of results but a kind of algorithmic state, so they encode your request, its progress, and its direction. For example, page 3 is referenced both by the nextPageToken of page 2 and by the prevPageToken of page 4, but the two tokens differ slightly so that they can indicate the direction you came from.
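You can see this in the tokens from the question's output. The following is a minimal Java sketch, not documented API behavior: the API treats page tokens as opaque, and the layout assumed here (unpadded URL-safe Base64 wrapping two varint fields tagged 0x08 and 0x10) is only what these particular tokens appear to contain. `PageTokenPeek` and `peek` are hypothetical names.

```java
import java.util.Base64;

public class PageTokenPeek {
    /**
     * Returns {offset, flag} under the ASSUMED layout 0x08 <varint> 0x10 <varint>.
     * Throws IllegalArgumentException if the token does not fit that layout.
     */
    public static long[] peek(String token) {
        byte[] bytes = Base64.getUrlDecoder().decode(token);
        int[] pos = {0}; // shared cursor into the byte array
        expectTag(bytes, pos, 0x08);
        long offset = readVarint(bytes, pos);
        expectTag(bytes, pos, 0x10);
        long flag = readVarint(bytes, pos);
        return new long[] {offset, flag};
    }

    private static void expectTag(byte[] bytes, int[] pos, int tag) {
        if (pos[0] >= bytes.length || (bytes[pos[0]++] & 0xFF) != tag) {
            throw new IllegalArgumentException("unexpected token layout");
        }
    }

    /** Reads a little-endian base-128 varint: 7 payload bits per byte, high bit = "more". */
    private static long readVarint(byte[] bytes, int[] pos) {
        long value = 0;
        int shift = 0;
        while (pos[0] < bytes.length) {
            int b = bytes[pos[0]++] & 0xFF;
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        // Tokens copied from the output in the question.
        for (String token : new String[] {"CDIQAA", "CDIQAQ", "CGQQAA", "CGQQAQ", "CJYBEAA"}) {
            long[] fields = peek(token);
            System.out.println(token + " -> offset " + fields[0] + ", flag " + fields[1]);
        }
    }
}
```

For the question's tokens, CDIQAA (the current token of page 2) decodes to offset 50 with flag 0, while CGQQAQ (the prevPageToken seen on page 3) decodes to offset 100 with flag 1. Both appear to describe results 51 to 100, once going forward from 50 and once going backward from 100, which would explain why the two strings differ.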



I see that you are not including "nextPageToken" in setFields.

For example:



import android.content.Context;
import android.util.Log;

import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpRequestInitializer;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory; // assumes the jackson2 artifact
import com.google.api.services.youtube.YouTube;
import com.google.api.services.youtube.model.SearchListResponse;
import com.google.api.services.youtube.model.SearchResult;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// VideoItem is the app's own model class (not shown here).
public class YoutubeConnector {
    private YouTube youtube;
    private YouTube.Search.List query;

    public static final String KEY = "YOUR API KEY";

    public YoutubeConnector(Context context) {
        youtube = new YouTube.Builder(new NetHttpTransport(), new JacksonFactory(), new HttpRequestInitializer() {
            @Override
            public void initialize(HttpRequest httpRequest) throws IOException {
            }
        }).setApplicationName(context.getString(R.string.app_name)).build();

        try {
            query = youtube.search().list("id,snippet");
            query.setMaxResults(10L);
            query.setKey(KEY);
            query.setType("video");
            // nextPageToken must be listed here, or it is stripped from the response.
            query.setFields("items(id/videoId,snippet/title,snippet/description,snippet/thumbnails/default/url),nextPageToken");
        } catch (IOException e) {
            Log.d("YC", "Could not initialize: " + e.getMessage());
        }
    }

    public List<VideoItem> search(String keywords) {
        query.setQ(keywords);
        try {
            List<VideoItem> items = new ArrayList<VideoItem>();
            String nextToken = null; // no page token on the first request
            int i = 0;
            do {
                query.setPageToken(nextToken);
                SearchListResponse response = query.execute();
                List<SearchResult> results = response.getItems();
                for (SearchResult result : results) {
                    VideoItem item = new VideoItem();
                    item.setTitle(result.getSnippet().getTitle());
                    item.setDescription(result.getSnippet().getDescription());
                    item.setThumbnailURL(result.getSnippet().getThumbnails().getDefault().getUrl());
                    item.setId(result.getId().getVideoId());
                    items.add(item);
                }
                nextToken = response.getNextPageToken();
                i++;
                System.out.println("nextToken :  " + nextToken);
            } while (nextToken != null && i < 20); // stop after 20 pages at most

            return items;
        } catch (IOException e) {
            Log.d("YC", "Could not search: " + e);
            return null;
        }
    }
}

      

Hope this helps you.



You can take the nextPageToken from a response and pass it as the pageToken parameter of the next request.

This will display the next page. I have added a var_dump to show you that the page tokens are not the same. Just copy this code and run it, and make sure you put the API client library folder (google-api-php-client) in the same folder as your plugin.

    <?php
    function doit(){if (isset($_GET['q'], $_GET['maxResults'])) {
      // Call set_include_path() as needed to point to your client library.
     // require_once ($_SERVER["DOCUMENT_ROOT"].'/API/youtube/google-api-php-client/src/Google_Client.php');
     // require_once ($_SERVER["DOCUMENT_ROOT"].'/API/youtube/google-api-php-client/src/contrib/Google_YouTubeService.php');
      set_include_path("./google-api-php-client/src");
      require_once 'Google_Client.php';
      require_once 'contrib/Google_YouTubeService.php';
      /* Set $DEVELOPER_KEY to the "API key" value from the "Access" tab of the
      Google APIs Console <http://code.google.com/apis/console#access>
      Please ensure that you have enabled the YouTube Data API for your project. */
      $DEVELOPER_KEY = 'YOUR_API_KEY'; // placeholder: never publish a real key



    $client = new Google_Client();
      $client->setDeveloperKey($DEVELOPER_KEY);

      $youtube = new Google_YoutubeService($client);

      try {
        $searchResponse = $youtube->search->listSearch('id,snippet', array(
          'q' => $_GET['q'],
          'maxResults' => $_GET['maxResults'],

    ));
    var_dump($searchResponse);


    $searchResponse2 = $youtube->search->listSearch('id,snippet', array(
      'q' => $_GET['q'],
      'maxResults' => $_GET['maxResults'],
      'pageToken' => $searchResponse['nextPageToken'],
    ));
    var_dump($searchResponse2);
    exit; // stops after the two dumps; remove this line to render the result list below


    $videos = '';
    $channels = '';
      foreach ($searchResponse['items'] as $searchResult) {
          switch ($searchResult['id']['kind']) {
         case 'youtube#video':

          $videoId =$searchResult['id']['videoId'];
          $title = $searchResult['snippet']['title'];
          $publishedAt= $searchResult['snippet']['publishedAt'];
          $description = $searchResult['snippet']['description'];
          $image_url =  $searchResult['snippet']['thumbnails']['default']['url'];
          $image_high  = $searchResult['snippet']['thumbnails']['high']['url'];

          echo  '<div class="souligne" id="'.$videoId.'">

            <div >
            <a href="http://www.youtube.com/watch?v='.$videoId.'" target="_blank">
            <img src="'.$image_url .'"   width ="150px" /> 
            </a> 
            </div>
            <div class="title">'.$title.'</div>
            <div class="des"> '.$description.' </div>
            <a id="'.$videoId.'" onclick="supp(this)" class="linkeda"> 
                + ADD
            </a>                
            </div>'
            ;
          break;
      }
    }
    echo  ' </ul></form>';

       } catch (Google_ServiceException $e) {
        $htmlBody .= sprintf('<p>A service error occurred: <code>%s</code></p>',
          htmlspecialchars($e->getMessage()));
      } catch (Google_Exception $e) {
        $htmlBody .= sprintf('<p>A client error occurred: <code>%s</code></p>',
          htmlspecialchars($e->getMessage()));
      }
    }}
       doit();
    ?>
    <!doctype html>
    <html>
      <head>
        <title>YouTube Search</title>
    <link href="//www.w3resource.com/includes/bootstrap.css" rel="stylesheet">
    <style type="text/css">
    body{margin-top: 50px; margin-left: 50px}
    </style>
      </head>
      <body>
        <form method="GET">
      <div>
        Search Term: <input type="search" id="q" name="q" placeholder="Enter Search Term">
      </div>
      <div>

        Max Results: <input type="number" id="maxResults" name="maxResults" min="1" max="50" step="1" value="25">
      </div>
      <div>
        page: <input type="number" id="startIndex" name="startIndex" min="1" max="50" step="1" value="2">
      </div>
      <input type="submit" value="Search">
    </form>

<h3>Videos</h3>
    <ul><?php if(isset($videos))echo $videos; ?></ul>
    <h3>Channels</h3>
    <ul><?php if(isset($channels)) echo $channels; ?></ul>
</body>
</html>
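
At the HTTP level, the second call in the PHP example differs from the first only by the extra pageToken query parameter. Here is a minimal Java sketch that builds such a request URL for the search endpoint without any client library. `SearchUrlSketch` and `searchUrl` are hypothetical names, and the key and token values are placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SearchUrlSketch {
    static final String ENDPOINT = "https://www.googleapis.com/youtube/v3/search";

    /** Builds a search URL; pageToken may be null for the first page. */
    public static String searchUrl(String query, int maxResults, String pageToken, String apiKey) {
        StringBuilder url = new StringBuilder(ENDPOINT)
                .append("?part=").append(encode("id,snippet"))
                .append("&type=video")
                .append("&q=").append(encode(query))
                .append("&maxResults=").append(maxResults)
                .append("&key=").append(encode(apiKey));
        if (pageToken != null) {
            // The nextPageToken of the previous response goes here unchanged.
            url.append("&pageToken=").append(encode(pageToken));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(searchUrl("cute cats", 50, null, "YOUR_API_KEY"));
        System.out.println(searchUrl("cute cats", 50, "CDIQAA", "YOUR_API_KEY"));
    }
}
```

Every other parameter should stay the same between pages; changing the query or filters mid-way makes the token meaningless.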

      
