Recovering ProtoBuf Descriptors
Below is some code I wrote to recover .proto definition files from encoded descriptor blobs embedded in binaries. It uses the DebugString() function available in the C++ bindings; a Python-based implementation of the same idea is available in PBTK (Protobuf Toolkit) by @marin-m. I don’t think my implementation offers any advantage, and I would probably have used PBTK instead had I found it earlier.
Usage
- Identify the descriptor blob and extract it out of the binary/code as a file.
- Pass the file to the program below as argv[1].
- The program will print the definition to stdout.
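The first step, finding the blob, can be partly automated. What follows is only a rough sketch of one heuristic, not something the program below depends on. It assumes the blob is a serialized FileDescriptorProto that starts with its name field (tag byte 0x0A, a one-byte length, then a file name ending in ".proto"), and that the blob ends where the top-level fields stop looking plausible. BlobRange, ReadVarint, WalkFields and FindDescriptorCandidates are hypothetical helper names, and the upper bound of 14 on field numbers is my assumption about descriptor.proto, so expect false positives and adjust as needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Half-open byte range [begin, end) of a candidate descriptor blob.
struct BlobRange {
    std::size_t begin;
    std::size_t end;
};

// Decodes a base-128 varint at `pos` and advances `pos` past it.
// Returns false if the input is truncated or the varint is too long.
bool ReadVarint(const std::string& data, std::size_t& pos, std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < data.size(); shift += 7) {
        const auto byte = static_cast<unsigned char>(data[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Walks top-level fields from `begin` and returns the offset just past the
// last field that still looks like part of a FileDescriptorProto.
std::size_t WalkFields(const std::string& data, std::size_t begin) {
    assert(begin <= data.size());
    std::size_t pos = begin;
    while (pos < data.size()) {
        std::size_t next = pos;
        std::uint64_t tag = 0;
        if (!ReadVarint(data, next, tag)) break;
        const std::uint64_t field = tag >> 3;
        const std::uint64_t wire_type = tag & 7;
        if (field < 1 || field > 14) break;  // assumed field-number range
        std::uint64_t payload = 0;           // varint value or byte length
        if (!ReadVarint(data, next, payload)) break;
        if (wire_type == 2) {                // length-delimited field
            if (payload > data.size() - next) break;
            next += payload;
        } else if (wire_type != 0) {         // only varint fields otherwise
            break;
        }
        pos = next;
    }
    return pos;
}

// Finds every ".proto" string preceded by 0x0A and a matching one-byte
// length, then walks forward to estimate where that blob ends.
std::vector<BlobRange> FindDescriptorCandidates(const std::string& data) {
    static const std::string kSuffix = ".proto";
    std::vector<BlobRange> found;
    std::size_t hit = data.find(kSuffix);
    while (hit != std::string::npos) {
        const std::size_t name_end = hit + kSuffix.size();
        // Try each possible name length below 128 (one-byte varint).
        for (std::size_t len = kSuffix.size(); len < 128 && len + 2 <= name_end; ++len) {
            const std::size_t begin = name_end - len - 2;
            if (data[begin] == 0x0A &&
                static_cast<unsigned char>(data[begin + 1]) == len) {
                found.push_back({begin, WalkFields(data, begin)});
                break;
            }
        }
        hit = data.find(kSuffix, hit + 1);
    }
    return found;
}
```

To use it, read the whole binary into a std::string, call FindDescriptorCandidates on it, and write each [begin, end) range out as its own file. Each of those files can then be passed to the program as argv[1]; ranges that are not real descriptors should simply fail to parse.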
Code
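#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <google/protobuf/descriptor.h>
#include <google/protobuf/descriptor.pb.h>
#include <google/protobuf/descriptor_database.h>
using namespace std;
using namespace google::protobuf;
int main(int argc, char* argv[])
{
    if (argc < 2) {
        cerr << "usage: " << argv[0] << " <descriptor-file>" << endl;
        return 1;
    }
    // Read contents of file in argv[1].
    ifstream events_file(argv[1], ios::in | ios::binary);
    if (!events_file) {
        cerr << "cannot open " << argv[1] << endl;
        return 1;
    }
    string events_data((istreambuf_iterator<char>(events_file)),
                       istreambuf_iterator<char>());
    // Add the raw data to an EncodedDescriptorDatabase.
    // The database does not copy the buffer, so events_data must outlive it.
    EncodedDescriptorDatabase edd;
    if (!edd.Add(events_data.data(), static_cast<int>(events_data.size()))) {
        cerr << "input is not a valid encoded FileDescriptorProto" << endl;
        return 1;
    }
    // Get the name of the file in the database.
    vector<string> files;
    if (!edd.FindAllFileNames(&files) || files.empty()) {
        cerr << "no file names found in the descriptor" << endl;
        return 1;
    }
    // Get FileDescriptorProto.
    FileDescriptorProto fdp;
    if (!edd.FindFileByName(files[0], &fdp)) {
        cerr << "cannot load " << files[0] << " from the database" << endl;
        return 1;
    }
    // Build a DescriptorPool with it, to get a FileDescriptor out of it.
    // BuildFile returns nullptr if the descriptor is invalid or imports files missing from the pool.
    DescriptorPool dp;
    const FileDescriptor* fd = dp.BuildFile(fdp);
    if (fd == nullptr) {
        cerr << "cannot build a FileDescriptor for " << files[0] << endl;
        return 1;
    }
    // Get the debug string out of the FileDescriptor.
    cout << fd->DebugString();
    return 0;
}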