Parsing HTTP headers in C++

I am using curl to communicate with the server.

When I request data, I get HTTP headers followed by JPEG data, separated by a boundary line.

I need to parse out:

  • the boundary line
  • the Content-Length value.

I copied the incoming data into a char array like this:

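static size_t OnReceiveData ( void * pvData, size_t tSize, size_t tCount, void * pvUser )
{
    // libcurl delivers tSize * tCount bytes; the buffer is not null-terminated.
    const size_t tTotal = tSize * tCount;
    printf("%.*s", (int) tTotal, (const char *) pvData);

    char* _data = nullptr;
    if(pvData != nullptr && 0 != tTotal)
    {
        // NOTE: _data is never stored or freed here, so it still leaks.
        _data = new char[tTotal];
        memcpy(_data, pvData, tTotal);
    }

    // Return the number of bytes handled, otherwise libcurl aborts the transfer.
    return ( tTotal );
}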

      

What is the best way to do this in C++? How can I search and parse the _data array for the information I want? Are there any additional libraries I could use, for example?

+3




3 answers


You can parse the headers on the fly, or put them into a map and process them later. Use the find and substr methods of std::string. Take a look at the Boost String Algorithms Library; it contains many useful algorithms, for example trim.

For example, to place the headers in a std::map and print them (rough sketch):

#include <stdlib.h>
#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <boost/algorithm/string.hpp>

int main(int argc, char* argv[]) {
  const char* s = "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/jpeg; charset=utf-8\r\n"
    "Content-Length: 19912\r\n\r\n";

  std::map<std::string, std::string> m;

  std::istringstream resp(s);
  std::string header;
  std::string::size_type index;
  while (std::getline(resp, header) && header != "\r") {
    index = header.find(':', 0);
    if(index != std::string::npos) {
      m.insert(std::make_pair(
        boost::algorithm::trim_copy(header.substr(0, index)), 
        boost::algorithm::trim_copy(header.substr(index + 1))
      ));
    }
  }

  for(auto& kv: m) {
    std::cout << "KEY: `" << kv.first << "`, VALUE: `" << kv.second << '`' << std::endl;
  }

  return EXIT_SUCCESS;
}

      



You will get the result:

KEY: `Content-Length`, VALUE: `19912`
KEY: `Content-Type`, VALUE: `image/jpeg; charset=utf-8`

      

With the headers in hand, you can extract the headers you want for further processing.

+4




The cpp-netlib project (based on Boost) contains a complete MIME parser (written with Boost.Spirit).



I'm not very happy with the parser interface, but it works well.

+1




I would put all the headers into a map, which you can then iterate over easily. No Boost required. Here's a basic working example with libcurl:

#include <iostream>
#include <string>
#include <map>
#include <curl/curl.h>

static size_t OnReceiveData (void * pData, size_t tSize, size_t tCount, void * pmUser)
{
    size_t length = tSize * tCount, index = 0;
    while (index < length)
    {
        unsigned char *temp = (unsigned char *)pData + index;
        if ((temp[0] == '\r') || (temp[0] == '\n'))
            break;
        index++;
    }

    std::string str((unsigned char*)pData, (unsigned char*)pData + index);
    std::map<std::string, std::string>* pmHeader = (std::map<std::string, std::string>*)pmUser;
    size_t pos = str.find(": ");
    if (pos != std::string::npos)
        pmHeader->insert(std::pair<std::string, std::string> (str.substr(0, pos), str.substr(pos + 2)));

    return length;  // libcurl expects the number of bytes handled (tSize * tCount)
}

int main(int argc, char* argv[])
{
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    std::map<std::string, std::string> mHeader;

    curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
    curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, OnReceiveData);
    curl_easy_setopt(curl, CURLOPT_HEADERDATA, &mHeader);
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);  // option takes a long
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);

    std::map<std::string, std::string>::const_iterator itt;
    for (itt = mHeader.begin(); itt != mHeader.end(); itt++)
    {
        if (itt->first == "Content-Type" || itt->first == "Content-Length")
            std::cout << itt->first << ": " << itt->second << std::endl;
    }
}

      

+1








