XML::Stream - Creates an XML Stream connection and parses return data
XML::Stream is an attempt at solidifying the use of XML via streaming.
This module provides the user with methods to connect to a remote
server, send a stream of XML to the server, and receive/parse an XML
stream from the server. It is primarily based on work for the Etherx XML
router developed by the Jabber Development Team. For more information
about this project visit http://xmpp.org/protocols/streams/.
XML::Stream gives the user the ability to define a central callback
that will be used to handle the tags received from the server. These
tags are passed in the format defined at instantiation time. When
the closing tag of an object is seen, the tree is finished and passed
to the callback function. What the user does with it from there is up
to them.
For a detailed description of how this module works, and about the data
structure that it returns, please view the source of Stream.pm and
look at the detailed description at the end of the file.
NOTE: The parser that XML::Stream::Parser provides, like most Perl
parsers, is synchronous. If you are in the middle of parsing a
packet and call a user defined callback, the Parser is blocked until
your callback finishes. This means you cannot, while operating on a
packet, send out another packet and wait for a response to that packet.
The response will never get to you. Threading might solve this, but as
we all know threading in Perl is not quite up to par yet. This issue
will be revisited in the future.
new(
debug => string,
debugfh => FileHandle,
debuglevel => 0|1|N,
debugtime => 0|1,
style => string)
Creates the XML::Stream object.
debug should be set to the path for the debug log
to be written. If set to ``stdout'' then the
debug will go there. Also, you can specify
a filehandle that already exists by using
debugfh.
debuglevel determines the amount of debug to generate.
0 is the least, 1 is a little more, N is the limit you want.
debugtime determines whether a timestamp should be prepended
to the entry.
style defines the way the data structure is
returned. The two available styles are:
tree - L<XML::Parser> Tree format
node - L<XML::Stream::Node> format
For more information see the respective man pages.
Starts the stream by listening on a port for someone to connect
and send the opening stream tag, and then sends a response based
on whether the received header was correct for this stream. Server
name, port, and namespace are required; otherwise we don't know
where to listen and what namespace to accept.
Accept an incoming connection.
If this is a listening socket then we need to respond to the
opening <stream:stream/>.
Starts the stream by connecting to the server, sending the opening
stream tag, and then waiting for a response and verifying that it
is correct for this stream. Server name, port, and namespace are
required otherwise we don't know where to send the stream to...
Connect(hostname=>string,
port=>integer,
to=>string,
from=>string,
myhostname=>string,
namespace=>string,
namespaces=>array,
connectiontype=>string,
ssl=>0|1,
ssl_verify =>0x00|0x01|0x02|0x04,
ssl_ca_path=>string,
srv=>string)
Opens a TCP connection to the
specified server and sends the proper
opening XML Stream tag. hostname,
port, and namespace are required.
namespaces allows you to use
XML::Stream::Namespace objects.
to is needed if you want the stream
to attribute to be something other
than the hostname you are connecting
to.
from is needed if you want the
stream from attribute to be something
other than the hostname you are
connecting from.
myhostname should
not be needed, but if the module
cannot determine your hostname
properly (check the debug log), set
this to the correct value. It can also
be set if you want the other side of the
stream to think that you are someone else.
connectiontype determines the kind of
connection that is made:
"tcpip" - TCP/IP (default)
"stdinout" - STDIN/STDOUT
"http" - HTTP
HTTP recognizes proxies if the ENV
variables http_proxy or https_proxy
are set.
ssl specifies whether an SSL socket
should be used for encrypted
communications.
ssl_verify determines whether peer
certificate verification takes place.
See the documentation for the
SSL_verify_mode parameter to
IO::Socket::SSL->new() in L<IO::Socket::SSL>.
The default value is 0x01 causing the
server certificate to be verified, and
requiring that ssl_ca_path be set.
ssl_ca_path should be set to the path to
either a directory containing hashed CA
certificates, or a single file containing
acceptable CA certificates concatenated
together. This parameter is required if
ssl_verify is set to anything other than
0x00 (no verification).
If srv is specified AND Net::DNS is
installed and can be loaded, then
an SRV query is sent to srv.hostname
and the results processed to replace
the hostname and port. If the lookup
fails, or Net::DNS cannot be loaded,
then hostname and port are left alone
as the defaults.
This function returns the same hash from GetRoot()
below. Make sure you get the SID
(Session ID) since you have to use it
to call most other functions in here.
Send the opening stream and save the root element info.
Starts the stream by opening a file and setting it up so that
Process reads from the filehandle to get the incoming stream.
OpenFile(string)
Opens a filehandle to the argument specified, and
pretends that it is a stream. It will ignore the
outer tag, and not check if it was a
<stream:stream/>. This is useful for writing a
program that has to parse any XML file that is
basically made up of small packets (like RDF).
Sends the closing XML tag and shuts down the socket.
Disconnect(sid)
Sends the proper closing XML tag and closes the specified socket down.
Initialize the connection data structure
Takes the incoming stream and makes sure that only full
XML tags get passed to the parser. If a full tag has not
been read yet, then the Stream saves the incomplete part and
sends the rest to the parser.
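The buffering idea can be sketched in plain Perl. This is only an illustration, not the module's actual code: the helper name split_complete and the rule of cutting the buffer at the last '>' are assumptions made for the example, and a real implementation also has to cope with '>' inside attribute values or CDATA.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::Stream: split a buffer into the
# text up to and including the last '>' and the unfinished remainder.
sub split_complete {
    my ($buffer) = @_;
    my $cut = rindex($buffer, ">");
    return ("", $buffer) if $cut < 0;        # no complete tag yet
    return (substr($buffer, 0, $cut + 1),    # safe to hand to a parser
            substr($buffer, $cut + 1));      # keep for the next read
}

# Simulate two reads from a socket that split a tag in the middle.
my $pending = "";
foreach my $chunk ("<a><b>hi</", "b></a><c") {
    my ($ready, $rest) = split_complete($pending . $chunk);
    $pending = $rest;
    print "parse: $ready\n" if length $ready;
}
print "held back: $pending\n";
```

The remainder is prepended to the next chunk, so a tag that arrives in two pieces is only handed on once its closing '>' has been read.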
Checks for data on the socket and returns a status code depending
on if there was data or not. If a timeout is not defined in the
call then the timeout defined in Connect() is used. If a timeout
of 0 is used then the call blocks until it gets some data,
otherwise it returns after the timeout period.
Process(integer)
Waits for data to be available on the socket. If
a timeout is specified then the Process function
waits that period of time before returning nothing.
If a timeout period is not specified then the
function blocks until data is received. The
function returns a hash with session ids as the key,
and status values or data as the hash values.
Takes the data from the server and returns a string
Takes the data string and sends it to the server
Send(sid, string);
Sends the string over the specified connection as is.
This does no checking if valid XML was sent or not.
Best behavior when sending information.
Process the <stream:features/> block.
Return the value of the stream feature (if any).
Have we received the stream:features yet?
Process a TLS based packet.
Client function to have the socket start TLS.
Send a <starttls/> in the TLS namespace.
Handle a <proceed/> packet.
Return 1 if the socket is secure, 0 otherwise.
Return 1 if the TLS process is done
Return the TLS error if any
Handle a <failure/>
Send a <failure/> in the TLS namespace
Process a SASL based packet.
When we get a <challenge/> we need to do the grunt
work to return a <response/>.
Send an <auth/> in the SASL namespace
Send a <challenge/> in the SASL namespace
This is a helper function to perform all of the required steps for doing SASL with the server.
Return 1 if we authed via SASL, 0 otherwise
Return 1 if the SASL process is finished
Return the error if any
Handle a received <failure/>
Handle a received <success/>
Send a <failure/> tag in the SASL namespace
Send a <response/> tag in the SASL namespace
If a call returns undef, you can call this function
and hopefully learn more about the problem.
GetErrorCode(sid)
returns a string for the specified session that
will hopefully contain some useful information
about why Process or Connect returned an undef
to you.
Given a type and text, generate a <stream:error/> packet to
send back to the other side.
Takes a host of arguments and sets a portion of the specified
data structure with that data. The function works in two
modes, ``single'' or ``multiple''. ``single'' denotes that the
function should locate the current tag that matches this
data and overwrite its contents with the data passed in.
``multiple'' denotes that a new tag should be created even if
others exist.
type - single or multiple
XMLTree - pointer to XML::Stream data object (tree or node)
tag - name of tag to create/modify (if blank assumes
working with top level tag)
data - CDATA to set for tag
attribs - attributes to ADD to tag
Takes a host of arguments and returns various data structures
that match them.
type - existence - returns 1 or 0 if the tag exists in the top level.
value - returns either the CDATA of the tag, or the
value of the attribute depending on which is
sought. This ignores any mark ups to the data
and just returns the raw CDATA.
value array - returns an array of strings representing
all of the CDATA in the specified tag.
This ignores any mark ups to the data
and just returns the raw CDATA.
tree - returns a data structure that represents the
XML with the specified tag as the root tag.
Depends on the format that you are working with.
tree array - returns an array of data structures each
with the specified tag as the root tag.
child array - returns a list of all children nodes
not including CDATA nodes.
attribs - returns a hash with the attributes, and
their values, for the things that match
the parameters
count - returns the number of things that match
the arguments
tag - returns the root tag of this tree
XMLTree - pointer to XML::Stream data structure
tag - tag to pull data from. If blank then the top level
tag is accessed.
attrib - attribute value to retrieve. Ignored for types
``value array'', ``tree'', ``tree array''. If paired
with value can be used to filter tags based on
attributes and values.
value - only valid if an attribute is supplied. Used to
filter for tags that only contain this attribute.
Useful to search through multiple tags that all
reference different name spaces.
Run an xpath query on a node and return back the result.
XPath(node,path) returns an array of results that match the xpath.
node can be either of the supported types (Tree, Node).
Run an xpath query on a node and return 1 or 0 if the path is
valid.
Takes an XML data tree and turns it into a hash of hashes.
This only works for certain kinds of XML trees like this:
<foo>
<bar>1</bar>
<x>
<y>foo</y>
</x>
<z>5</z>
<z>6</z>
</foo>
The resulting hash would be:
$hash{bar} = 1;
$hash{x}->{y} = "foo";
$hash{z}->[0] = 5;
$hash{z}->[1] = 6;
Good for config files.
Takes a hash and produces an XML string from it. If the hash looks like this:
$hash{bar} = 1;
$hash{x}->{y} = "foo";
$hash{z}->[0] = 5;
$hash{z}->[1] = 6;
The resulting xml would be:
<foo>
<bar>1</bar>
<x>
<y>foo</y>
</x>
<z>5</z>
<z>6</z>
</foo>
Good for config files.
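As a rough illustration of the hash-to-XML mapping described above, the following plain-Perl sketch reproduces the example output. It is not the module's own routine: the name hash_to_xml, the two-space indentation, and the sorted key order are assumptions made for the example, and values are not escaped.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the module's own routine: turn a nested hash
# into an XML string. Hash refs become child elements and array refs
# become repeated elements with the same tag name.
sub hash_to_xml {
    my ($tag, $value, $indent) = @_;
    $indent = "" unless defined $indent;
    if (ref($value) eq "ARRAY") {
        return join "", map { hash_to_xml($tag, $_, $indent) } @$value;
    }
    if (ref($value) eq "HASH") {
        my $inner = join "",
            map { hash_to_xml($_, $value->{$_}, "$indent  ") }
            sort keys %$value;
        return "$indent<$tag>\n$inner$indent</$tag>\n";
    }
    return "$indent<$tag>$value</$tag>\n";    # plain scalar value
}

my %hash;
$hash{bar}    = 1;
$hash{x}->{y} = "foo";
$hash{z}->[0] = 5;
$hash{z}->[1] = 6;

print hash_to_xml("foo", \%hash);
```

Keys are sorted only so that the output is stable from run to run; a Perl hash has no inherent key order.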
Simple function to make sure that no bad characters make it
into the XML string that might cause the string to be
misinterpreted.
Simple function to take an escaped string and return it to normal.
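A minimal sketch of what escaping and unescaping involve, assuming the five predefined XML entities are the characters of interest. The helper names escape_xml and unescape_xml are hypothetical; the module's own routines may differ in name and in the exact set of characters handled.

```perl
use strict;
use warnings;

# Hypothetical helpers illustrating the idea only.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must come first so later entities survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

sub unescape_xml {
    my ($text) = @_;
    $text =~ s/&lt;/</g;
    $text =~ s/&gt;/>/g;
    $text =~ s/&quot;/"/g;
    $text =~ s/&apos;/'/g;
    $text =~ s/&amp;/&/g;     # must come last to avoid double unescaping
    return $text;
}

my $raw     = q{if (a < b && c > "d") { it's fine }};
my $escaped = escape_xml($raw);
print "$escaped\n";
print unescape_xml($escaped) eq $raw ? "round trip ok\n" : "mismatch\n";
```

The ordering of the ampersand substitution matters in both directions: escaping it first and unescaping it last keeps text such as "&lt;" in the original data from being altered by the round trip.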
Takes one of the data formats that XML::Stream supports and calls
the proper BuildXML_xxx function on it.
Return the namespace from the constant string.
Returns the hash of attributes for the root <stream:stream/> tag
so that any attributes returned can be accessed. from and any
xmlns:foobar might be important.
GetRoot(sid)
Returns the attributes that the stream:stream tag sent
by the other end listed in a hash for the specified session.
Returns the socket so that an outside function can access it if desired.
GetSock(sid)
Returns a pointer to the IO::Socket object for the specified session.
Returns a session ID to send to an incoming stream in the return
header. By default it just increments a counter and returns that,
or you can define a function and set it using the SetCallBacks
function.
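The default behaviour described above amounts to a simple counter. The sketch below shows one way to write such a generator as a closure; the name make_sid_generator and the starting value are assumptions for illustration, and how a replacement function is registered with the module is not shown here.

```perl
use strict;
use warnings;

# Hypothetical generator, shown only to illustrate a counter-based
# session ID source: each call returns the next value of a private counter.
sub make_sid_generator {
    my $counter = 0;
    return sub { return ++$counter; };
}

my $next_sid = make_sid_generator();
print $next_sid->(), "\n" for 1 .. 3;
```

A custom function could follow the same shape while returning something less predictable than a running number.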
Takes a hash with top level tags to look for as the keys
and pointers to functions as the values.
SetCallBacks(node=>function, update=>function);
Sets the callback that should be called in various situations.
node is used to handle the data structures that are built for each top level tag.
update is used for when Process is blocking waiting for data, but you
want your original code to be updated.
$NONBLOCKING
Tells the Parser to enter into a nonblocking state. This
might cause some funky behavior since you can get nested
callbacks while things are waiting. 1=on, 0=off(default).
simple example
use XML::Stream qw( Tree );
$stream = XML::Stream->new;
my $status = $stream->Connect(hostname => "jabber.org",
port => 5222,
namespace => "jabber:client");
if (!defined($status)) {
print "ERROR: Could not connect to server\n";
print " (",$stream->GetErrorCode(),")\n";
exit(0);
}
while($node = $stream->Process()) {
# do something with $node
}
$stream->Disconnect();
Example using a handler
use XML::Stream qw( Tree );
$stream = XML::Stream->new;
$stream->SetCallBacks(node=>\&noder);
$stream->Connect(hostname => "jabber.org",
port => 5222,
namespace => "jabber:client",
timeout => undef) || die $!;
# Blocks here forever, noder is called for incoming
# packets when they arrive.
while(defined($stream->Process())) { }
print "ERROR: Stream died (",$stream->GetErrorCode(),")\n";
sub noder
{
my $sid = shift;
my $node = shift;
# do something with $node
}
Tweaked, tuned, and brightness changes by Ryan Eatmon, reatmon@ti.com
in May of 2000.
Colorized, and Dolby Surround sound added by Thomas Charron,
tcharron@jabber.org
By Jeremie in October of 1999 for http://etherx.jabber.org/streams/
Currently maintained by Darian Anthony Patrick.
Copyright (C) 1998-2004 Jabber Software Foundation http://jabber.org/
This module licensed under the LGPL, version 2.1.