protobufs.pl -- Google's Protocol Buffers ("protobufs")
Protocol buffers are Google's language-neutral, platform-neutral,
extensible mechanism for serializing structured data -- think XML, but
smaller, faster, and simpler. You define how you want your data to be
structured once. This takes the form of a template that describes the
data structure. You use this template to encode your data structure into
wire-streams, and to decode it from them; these streams may be sent to or
read from your peers.
The underlying wire stream is platform independent, lossless, and may be
used to interwork with a variety of languages and systems regardless of
word size or endianness. Techniques exist to safely extend your data
structure without breaking deployed programs that are compiled against
the "old" format.
The idea behind Google's Protocol Buffers is that you define your
structured messages using a domain-specific language and tool
set. Further documentation on this is at
https://developers.google.com/protocol-buffers.
There are two ways you can use protobufs in Prolog:
- The protobuf_parse_from_codes/3 and protobuf_serialize_to_codes/3
interface translates between a "wire stream" and a Prolog term. This
interface takes advantage of SWI-Prolog's dict. The protoc plugin
(protoc-gen-swipl) generates a Prolog file of meta-information that
captures the .proto file's definition in the protobufs module, with
the following facts:
proto_meta_normalize(Unnormalized, Normalized)
proto_meta_package(Package, FileName, Options)
proto_meta_message_type(Fqn, Package, Name)
proto_meta_message_type_map_entry(Fqn)
proto_meta_field_name(Fqn, FieldNumber, FieldName, FqnName)
proto_meta_field_json_name(FqnName, JsonName)
proto_meta_field_label(FqnName, LabelRepeatOptional) % 'LABEL_OPTIONAL', 'LABEL_REQUIRED', 'LABEL_REPEATED'
proto_meta_field_type(FqnName, Type) % 'TYPE_INT32', 'TYPE_MESSAGE', etc
proto_meta_field_type_name(FqnName, TypeName)
proto_meta_field_default_value(FqnName, DefaultValue)
proto_meta_field_option_packed(FqnName)
proto_meta_enum_type(FqnName, Fqn, Name)
proto_meta_enum_value(FqnName, Name, Number)
proto_meta_field_oneof_index(FqnName, Index)
proto_meta_oneof(FqnName, Index, Name)
- The protobuf_message/2 interface allows you to define your message
template as a list of predefined Prolog terms that correspond to
production rules in the Definite Clause Grammar (DCG) that realizes the
interpreter. Each production rule has an equivalent rule in the protobuf
grammar. The process is not unlike specifying the format of a regular
expression. To encode a template to a wire-stream, you pass a grounded
template, X, and a variable, Y, to protobuf_message/2. To decode a
wire-stream, Y, you pass an ungrounded template, X, along with a grounded
wire-stream, Y, to protobuf_message/2. The interpreter will unify the
unbound variables in the template with values decoded from the wire-stream.
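The encode and decode directions described above can be sketched as follows. This is a minimal illustration, not taken from the library's own examples: the message layout (field 1 as an unsigned integer, field 2 as an atom) and the predicate names message_template/3, encode_demo/1 and decode_demo/3 are hypothetical. It assumes the template terms unsigned(Tag, Value) and atom(Tag, Value) provided by library(protobufs).

```prolog
:- use_module(library(protobufs)).

% Hypothetical message: field 1 is an unsigned integer (varint),
% field 2 is an atom (length-delimited).
message_template(Id, Name,
                 protobuf([ unsigned(1, Id),
                            atom(2, Name)
                          ])).

% Encode: the template is ground, the wire-stream is unbound.
encode_demo(WireStream) :-
    message_template(150, hello, Template),
    protobuf_message(Template, WireStream).

% Decode: the template contains unbound variables, the wire-stream
% is ground. Id and Name are unified with the decoded values.
decode_demo(WireStream, Id, Name) :-
    message_template(Id, Name, Template),
    protobuf_message(Template, WireStream).
```

Because the same template is used in both directions, calling encode_demo(W) and then decode_demo(W, Id, Name) should round-trip the values. The fields must appear in the wire-stream in the order given in the template.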
For an overview and tutorial with examples, see library(protobufs): Google's
Protocol Buffers.
Examples of usage may also be found by inspecting test_protobufs.pl and the
demo directory, or by looking at the "addressbook" example that is typically
installed at
/usr/lib/swi-prolog/doc/packages/examples/protobufs/interop/addressbook.pl
- author
- - Jeffrey Rosenwald (JeffRose@acm.org)
- - Peter Ludemann (peter.ludemann@gmail.org)
- See also
- - https://developers.google.com/protocol-buffers
- - https://developers.google.com/protocol-buffers/docs/encoding
- Compatibility
- - SWI-Prolog
- protobuf_parse_from_codes(+WireCodes:list(int), +MessageType:atom, -Term) is semidet
- Process bytes (a list of int) that are the serialized form of a message
(designated by MessageType), creating a Prolog term.
Protoc must have been run (with the --swipl_out= option) and the resulting
top-level _pb.pl file loaded. For more details, see the "protoc" section of the
overview documentation.
Fails if the message can't be parsed or if the appropriate meta-data from protoc
hasn't been loaded.
All fields that are omitted from the WireCodes are set to their
default values (typically the empty string or 0, depending on the
type; or [] for repeated groups). There is no way of testing
whether a value was specified in WireCodes or given its default
value (that is, there is no equivalent of the Python
implementation's HasField). Optional embedded messages and groups
do not have any default value -- you must check their existence by
using get_dict/3 or similar. If a field is part of a "oneof" set,
then none of the other fields is set. You can determine which field
had a value by using get_dict/3.
- Arguments:
- - WireCodes: Wire format of the message from e.g., read_stream_to_codes/2.
(The stream should have options encoding(octet) and type(binary),
either as options to read_file_to_codes/3 or by calling set_stream/2
on the stream to read_stream_to_codes/2.)
- - MessageType: Fully qualified message name (from the .proto file's package and message).
For example, if the package is google.protobuf and the
message is FileDescriptorSet, then you would use
'.google.protobuf.FileDescriptorSet' or 'google.protobuf.FileDescriptorSet'.
If there's no package name, use e.g.: 'MyMessage' or '.MyMessage'.
You can see the packages by looking at
protobufs:proto_meta_package(Pkg,File,_)
and the message names and fields by
protobufs:proto_meta_field_name('.google.protobuf.FileDescriptorSet',
FieldNumber, FieldName, FqnName) (the initial '.' is not optional for these facts,
only for the top-level name given to protobuf_parse_from_codes/3).
- - Term: The generated term, as nested dicts.
- Errors
- - version_error(Module-Version): you need to recompile the Module
with a newer version of protoc.
- See also
- - library(protobufs): Google's Protocol Buffers
- bug
- - Ignores .proto extensions.
- - map fields don't get special treatment (but see protobuf_map_pairs/3).
- - Generates fields in a different order from the C++, Python,
Java implementations, which use the field number to determine
field order whereas currently this implementation uses field
name. (This isn't strictly speaking a bug, because it's allowed
by the specification; but it might cause some surprise.)
- To be done
- - document the generated terms (see library(http/json) and json_read_dict/3)
- - add options such as true and value_string_as (similar to json_read_dict/3)
- - add option for form of the dict tags (fully qualified or not)
- - add option for outputting fields in the C++/Python/Java order
(by field number rather than by field name).
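As an illustration of the calling pattern described in this entry, the sketch below reads a file as raw bytes and parses it. It is a sketch under stated assumptions: parse_file/3 and field_present/2 are hypothetical helper names, and the meta-data for the message type must already have been loaded from a protoc-generated _pb.pl file (not shown here).

```prolog
:- use_module(library(protobufs)).
:- use_module(library(readutil)).

% Assumption: a top-level _pb.pl file generated by protoc with
% --swipl_out= has already been loaded, so that the meta-data for
% MessageType is available.

% parse_file(+File, +MessageType, -Term)
% Read File as octets and parse it into nested dicts.
parse_file(File, MessageType, Term) :-
    read_file_to_codes(File, WireCodes,
                       [encoding(octet), type(binary)]),
    protobuf_parse_from_codes(WireCodes, MessageType, Term).

% field_present(+Term, +Field)
% Optional embedded messages have no default value, so their
% presence is tested with get_dict/3.
field_present(Term, Field) :-
    get_dict(Field, Term, _).
```

A call might look like parse_file('book.bin', 'tutorial.AddressBook', Term), where both the file name and the message name are made up for this example.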
- protobuf_serialize_to_codes(+Term:dict, +MessageType:atom, -WireCodes:list(int)) is det
- Process a Prolog term into bytes (a list of int) that are the serialized
form of a message (designated by MessageType).
Protoc must have been run (with the --swipl_out= option) and the resulting
top-level _pb.pl file loaded. For more details, see the "protoc" section of the
overview documentation.
Fails if the term isn't of an appropriate form or if the appropriate
meta-data from protoc hasn't been loaded, or if a field name is incorrect
(and therefore nothing in the meta-data matches it).
- Arguments:
- - Term: The Prolog form of the data, as nested dicts.
- - MessageType: Fully qualified message name (from the .proto file's package and message).
For example, if the package is google.protobuf and the
message is FileDescriptorSet, then you would use
'.google.protobuf.FileDescriptorSet' or 'google.protobuf.FileDescriptorSet'.
If there's no package name, use e.g.: 'MyMessage' or '.MyMessage'.
You can see the packages by looking at
protobufs:proto_meta_package(Pkg,File,_)
and the message names and fields by
protobufs:proto_meta_field_name('.google.protobuf.FileDescriptorSet',
FieldNumber, FieldName, FqnName) (the initial '.' is not optional for these facts,
only for the top-level name given to protobuf_serialize_to_codes/3).
- - WireCodes: Wire format of the message, which can be output using
format('~s', [WireCodes]).
- Errors
- - version_error(Module-Version): you need to recompile the Module
with a newer version of protoc.
- - existence_error if a field can't be found in the meta-data
- See also
- - library(protobufs): Google's Protocol Buffers
- bug
- - map fields don't get special treatment (but see protobuf_map_pairs/3).
- - oneof is not checked for validity.
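The following sketch shows one way to serialize a dict and write the result to a file. The helper names write_message/3 and write_codes/2 are hypothetical, and the meta-data for the message type is assumed to have been loaded from a protoc-generated _pb.pl file. It writes the bytes with put_byte/2 on a binary stream; this is an alternative to the format('~s', [WireCodes]) call mentioned above, not the library's prescribed method.

```prolog
:- use_module(library(protobufs)).
:- use_module(library(apply)).

% write_codes(+File, +Codes)
% Write a list of byte values to File, one byte at a time.
write_codes(File, Codes) :-
    setup_call_cleanup(
        open(File, write, Out, [type(binary)]),
        maplist(put_byte(Out), Codes),
        close(Out)).

% write_message(+File, +MessageType, +Term)
% Assumption: the meta-data for MessageType has been loaded from a
% generated _pb.pl file, and Term is a dict whose keys match the
% field names of that message.
write_message(File, MessageType, Term) :-
    protobuf_serialize_to_codes(Term, MessageType, WireCodes),
    write_codes(File, WireCodes).
```

A call might look like write_message('person.bin', 'tutorial.Person', _{name:"Ann", id:1}), where the file name, message name, and field names are invented for illustration.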
- protobuf_message(?Template, ?WireStream) is semidet
- protobuf_message(?Template, ?WireStream, ?Rest) is nondet
- Marshals and unmarshals byte streams encoded using Google's
Protobuf grammars. protobuf_message/2 provides a bi-directional
parser that marshals a Prolog structure to WireStream, according
to rules specified by Template. It can also unmarshal WireStream
into a Prolog structure according to the same grammar.
protobuf_message/3 provides a difference list version.
- Arguments:
- - Template: is a protobuf grammar specification. On decode,
unbound variables in the Template are unified with their respective
values in the WireStream. On encode, Template must be ground.
- - WireStream: is a code list that was generated by a protobuf
encoder using an equivalent template.
- bug
- - The protobuf specification states that the wire-stream can have
the fields in any order and that unknown fields are to be ignored.
This implementation assumes that the fields are in the exact order
of the definition and match exactly. If you use
protobuf_parse_from_codes/3, you can avoid this problem.
- protobuf_field_is_map(+MessageType, +FieldName) is semidet
- Succeeds if MessageType's FieldName is defined as a map<...> in
the .proto file.
- protobuf_map_pairs(+ProtobufTermList:list, ?DictTag:atom, ?Pairs) is det
- Convert between a list of protobuf map entries (in the form
DictTag{key:Key, value:Value}) and a key-value list as described
in library(pairs). At least one of ProtobufTermList and Pairs
must be instantiated; DictTag can be uninstantiated. If
ProtobufTermList is from a term created by
protobuf_parse_from_codes/3, the ordering of the items is undefined;
you can order them by using keysort/2 (or by a predicate such as
dict_pairs/3, list_to_assoc/2, or list_to_rbtree/2).
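A small sketch of both directions of the conversion follows. The helper names map_entries_sorted/2 and pairs_to_entries/3, the dict tag entry, and the keys and values in the usage note are made up for illustration.

```prolog
:- use_module(library(protobufs)).

% map_entries_sorted(+Entries, -Sorted)
% Convert map entries (dicts with 'key' and 'value') to Key-Value
% pairs, then sort by key, since the original order is undefined.
map_entries_sorted(Entries, Sorted) :-
    protobuf_map_pairs(Entries, _DictTag, Pairs),
    keysort(Pairs, Sorted).

% pairs_to_entries(+DictTag, +Pairs, -Entries)
% The reverse direction: build map entries from Key-Value pairs.
pairs_to_entries(DictTag, Pairs, Entries) :-
    protobuf_map_pairs(Entries, DictTag, Pairs).
```

For example, map_entries_sorted([entry{key:b, value:2}, entry{key:a, value:1}], Sorted) converts the two entries to pairs and orders them by key. The sorted pairs can then be passed to list_to_assoc/2 if lookups by key are needed.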
Re-exported predicates
The following predicates are exported from this file while their implementation is defined in imported modules or non-module files loaded by this module.
- protobuf_message(?Template, ?WireStream) is semidet
- protobuf_message(?Template, ?WireStream, ?Rest) is nondet