From 6aaff53441a28a9cc6db7bf4c4bd067c343b9f2f Mon Sep 17 00:00:00 2001 From: Charlie Savage Date: Mon, 8 Jan 2024 11:43:55 -0800 Subject: [PATCH] Update README.rdoc to remove old code and update status build badge. --- README.rdoc | 425 +++++++++++++++++++++++++--------------------------- 1 file changed, 208 insertions(+), 217 deletions(-) diff --git a/README.rdoc b/README.rdoc index 96303673..3acd1f42 100644 --- a/README.rdoc +++ b/README.rdoc @@ -1,217 +1,208 @@ -= LibXML Ruby - -== Overview -The libxml gem provides Ruby language bindings for GNOME's Libxml2 -XML toolkit. It is free software, released under the MIT License. - -We think libxml-ruby is the best XML library for Ruby because: - -* Speed - Its much faster than REXML and Hpricot -* Features - It provides an amazing number of featues -* Conformance - It passes all 1800+ tests from the OASIS XML Tests Suite - -== Requirements -libxml-ruby requires Ruby 1.8.7 or higher. It depends on libxml2 to -function properly. libxml2, in turn, depends on: - -* libm (math routines: very standard) -* libz (zlib) -* libiconv - -If you are running Linux or Unix you'll need a C compiler so the -extension can be compiled when it is installed. If you are running -Windows, then install the x64-mingw32 gem or build it yourself using -Devkit[http://rubyinstaller.org/add-ons/devkit/] or -msys2[https://msys2.github.io/]. - -== Installation -The easiest way to install libxml-ruby is via RubyGems. To install: - -gem install libxml-ruby - -If the extension compile process cannot find libxml2, you may need to indicate -the location of the libxml2 configuration utility as it is used to find the -required header and include files. (If you need to indicate a location for the -libxml2 library or header files different than reported by xml2-config, -see the additional configuration options.) - -This may be done with RubyGems: - -gem install libxml-ruby -- --with-xml2-dir=/path/to/xml2-config - -Or bundler: - -bundle config build.libxml-ruby --with-xml2-config=/path/to/xml2-config - -bundle install libxml-ruby - -If you are running Windows, then install the libxml-ruby-x64-mingw32 gem. -The gem includes prebuilt extensions for Ruby 2.3. These -extensions are built using MinGW64 and libxml2 version 2.9.3, -iconv version 1.14 and zlib version 1.2.8. Note these binaries -are available in the lib\\libs directory. To use them, put them -on your PATH. - -The gem also includes a Microsoft VC++ 2012 solution and XCode 5 project - these -are very useful for debugging. - -libxml-ruby's source codes lives on GitHub[https://github.com/xml4r/libxml-ruby]. - -== Getting Started -Using libxml is easy. First decide what parser you want to use: - -* Generally you'll want to use the LibXML::XML::Parser which provides a tree based API. -* For larger documents that don't fit into memory, or if you prefer an input based API, use the LibXML::XML::Reader. -* To parse HTML files use LibXML::XML::HTMLParser. -* If you are masochistic, then use the LibXML::XML::SaxParser, which provides a callback API. - -Once you have chosen a parser, choose a datasource. Libxml can parse files, strings, URIs -and IO streams. For each data source you can specify an LibXML::XML::Encoding, a base uri and -various parser options. For more information, refer the LibXML::XML::Parser.document, -LibXML::XML::Parser.file, LibXML::XML::Parser.io or LibXML:::XML::Parser.string methods (the -same methods are defined on all four parser classes). - -== Advanced Functionality -Beyond the basics of parsing and processing XML and HTML documents, -libxml provides a wealth of additional functionality. - -Most commonly, you'll want to use its LibXML::XML::XPath support, which makes -it easy to find data inside an XML document. Although not as popular, -LibXML::XML::XPointer provides another API for finding data inside an XML document. - -Often times you'll need to validate data before processing it. For example, -if you accept user generated content submitted over the Web, you'll -want to verify that it does not contain malicious code such as embedded scripts. -This can be done using libxml's powerful set of validators: - -* DTDs (LibXML::XML::Dtd) -* Relax Schemas (LibXML::XML::RelaxNG) -* XML Schema (LibXML::XML::Schema) - -Finally, if you'd like to use XSL Transformations to process data, then install -the {libxslt gem}[https://github.com/xml4r/libxslt-rubygem]. - -== Usage -For information about using libxml-ruby please refer to its -documentation[http://xml4r.github.io/libxml-ruby]. Some tutorials are also -available[https://github.com/xml4r/libxml-ruby/wiki]. - -All libxml classes are in the LibXML::XML module. The easiest -way to use libxml is to require 'xml'. This will mixin -the LibXML module into the global namespace, allowing you to -write code like this: - - require 'xml' - document = XML::Document.new - -However, when creating an application or library you plan to -redistribute, it is best to not add the LibXML module to the global -namespace, in which case you can either write your code like this: - - require 'libxml' - document = LibXML::XML::Document.new - -Or you can utilize a namespace for your own work and include LibXML into it. -For example: - - require 'libxml' - - module MyApplication - include LibXML - - class MyClass - def some_method - document = XML::Document.new - end - end - end - -For simplicity's sake, the documentation uses the xml module in its examples. - -== Threading -libxml-ruby fully supports native, background Ruby threads. This of course -only applies to Ruby 1.9.x and higher since earlier versions of Ruby do not -support native threads. - -== Tests - -To run tests you first need to build the shared libary: - - rake compile - -Once you have build the shared libary, you can then run tests using rake: - - rake test - -+Travis build status: {Build Status}[https://travis-ci.org/xml4r/libxml-ruby] - -== Performance - -In addition to being feature rich and conformation, the main reason -people use libxml-ruby is for performance. Here are the results -of a couple simple benchmarks recently blogged about on the -Web (you can find them in the benchmark directory of the -libxml distribution). - -From http://depixelate.com/2008/4/23/ruby-xml-parsing-benchmarks - - user system total real - libxml 0.032000 0.000000 0.032000 ( 0.031000) - Hpricot 0.640000 0.031000 0.671000 ( 0.890000) - REXML 1.813000 0.047000 1.860000 ( 2.031000) - -From https://svn.concord.org/svn/projects/trunk/common/ruby/xml_benchmarks/ - - user system total real - libxml 0.641000 0.031000 0.672000 ( 0.672000) - hpricot 5.359000 0.062000 5.421000 ( 5.516000) - rexml 22.859000 0.047000 22.906000 ( 23.203000) - - -== Documentation -Documentation is available via rdoc, and is installed automatically with the -gem. - -libxml-ruby's {online -documentation}[https://xml4r.github.io/libxml-ruby/rdoc/index.html] is generated -using Hanna, which is a development gem dependency. - -Note that older versions of Rdoc, which ship with Ruby 1.8.x, will report -a number of errors. To avoid them, install Rdoc 2.1 or higher. Once you have -installed the gem, you'll have to disable the version of Rdoc that Ruby 1.8.x -includes. An easy way to do that is rename the directory -ruby/lib/ruby/1.8/rdoc to -ruby/lib/ruby/1.8/rdoc_old. - -== Support -If you have any questions about using libxml-ruby, please report an issue -on GitHub[https://github.com/xml4r/libxml-ruby/issues]. - -== Memory Management -libxml-ruby automatically manages memory associated with the -underlying libxml2 library. The bindings create a one-to-one mapping between -Ruby objects and libxml documents and libxml parent nodes (ie, nodes that do not -have a parent and do not belong to a document). In these cases, -the bindings manage the memory. They do this by installing a free -function and storing a back pointer to the Ruby object from the xmlnode -using the _private member on libxml structures. When the Ruby object -goes out of scope, the underlying libxml structure is freed. Libxml -itself then frees all child nodes (recursively). - -For all other nodes (the vast majority), the bindings create temporary -Ruby objects that get freed once they go out of scope. Thus there can be -more than one Ruby object pointing to the same xml node. To mostly hide -this from a programmer on the Ruby side, the #eql? and #== methods are -overriden to check if two Ruby objects wrap the same xmlnode. If they do, -then the methods return true. During the mark phase, each of these temporary -objects marks its owning document, thereby keeping the Ruby document object -alive and thus the xmldoc tree. - -In the sweep phase of the garbage collector, or when a program ends, -there is no order to how Ruby objects are freed. In fact, the Ruby document -object is almost always freed before any Ruby objects that wrap child nodes. -However, this is ok because those Ruby objects do not have a free function -and are no longer in scope (since if they were the document would not be freed). - -== License -See LICENSE for license information. += LibXML Ruby + +== Overview +The libxml gem provides Ruby language bindings for GNOME's Libxml2 +XML toolkit. It is free software, released under the MIT License. + +We think libxml-ruby is the best XML library for Ruby because: + +* Speed - Its much faster than REXML and Hpricot +* Features - It provides an amazing number of featues +* Conformance - It passes all 1800+ tests from the OASIS XML Tests Suite + +== Requirements +libxml-ruby requires Ruby 3.0.0 or higher. It depends on libxml2 to +function properly. libxml2, in turn, depends on: + +* libm (math routines: very standard) +* libz (zlib) +* libiconv + +If you are running Linux or Unix you'll need a C compiler so the +extension can be compiled when it is installed. If you are running +Windows, then install the x64-mingw-ucr gem or build it yourself using (Ruby +for Windows)[https://rubyinstaller.org/] or directly with msys2[https://msys2.github.io/] +and ucrt64. + +== Installation +The easiest way to install libxml-ruby is via RubyGems. To install: + +gem install libxml-ruby + +If the extension compile process cannot find libxml2, you may need to indicate +the location of the libxml2 configuration utility as it is used to find the +required header and include files. (If you need to indicate a location for the +libxml2 library or header files different than reported by xml2-config, +see the additional configuration options.) + +This may be done with RubyGems: + +gem install libxml-ruby -- --with-xml2-dir=/path/to/xml2-config + +Or bundler: + +bundle config build.libxml-ruby --with-xml2-config=/path/to/xml2-config + +bundle install libxml-ruby + +If you are running Windows, then install the libxml-ruby-x64-mingw32 gem. +The gem includes prebuilt extensions for Ruby 3.2 and 3.3. + +The gem also includes a Microsoft VC++ solution and XCode project - these +are very useful for debugging. + +libxml-ruby's source codes lives on GitHub[https://github.com/xml4r/libxml-ruby]. + +== Getting Started +Using libxml is easy. First decide what parser you want to use: + +* Generally you'll want to use the LibXML::XML::Parser which provides a tree based API. +* For larger documents that don't fit into memory, or if you prefer an input based API, use the LibXML::XML::Reader. +* To parse HTML files use LibXML::XML::HTMLParser. +* If you are masochistic, then use the LibXML::XML::SaxParser, which provides a callback API. + +Once you have chosen a parser, choose a datasource. Libxml can parse files, strings, URIs +and IO streams. For each data source you can specify an LibXML::XML::Encoding, a base uri and +various parser options. For more information, refer the LibXML::XML::Parser.document, +LibXML::XML::Parser.file, LibXML::XML::Parser.io or LibXML:::XML::Parser.string methods (the +same methods are defined on all four parser classes). + +== Advanced Functionality +Beyond the basics of parsing and processing XML and HTML documents, +libxml provides a wealth of additional functionality. + +Most commonly, you'll want to use its LibXML::XML::XPath support, which makes +it easy to find data inside an XML document. Although not as popular, +LibXML::XML::XPointer provides another API for finding data inside an XML document. + +Often times you'll need to validate data before processing it. For example, +if you accept user generated content submitted over the Web, you'll +want to verify that it does not contain malicious code such as embedded scripts. +This can be done using libxml's powerful set of validators: + +* DTDs (LibXML::XML::Dtd) +* Relax Schemas (LibXML::XML::RelaxNG) +* XML Schema (LibXML::XML::Schema) + +Finally, if you'd like to use XSL Transformations to process data, then install +the {libxslt gem}[https://github.com/xml4r/libxslt-rubygem]. + +== Usage +For information about using libxml-ruby please refer to its +documentation[http://xml4r.github.io/libxml-ruby]. Some tutorials are also +available[https://github.com/xml4r/libxml-ruby/wiki]. + +All libxml classes are in the LibXML::XML module. The easiest +way to use libxml is to require 'xml'. This will mixin +the LibXML module into the global namespace, allowing you to +write code like this: + + require 'xml' + document = XML::Document.new + +However, when creating an application or library you plan to +redistribute, it is best to not add the LibXML module to the global +namespace, in which case you can either write your code like this: + + require 'libxml' + document = LibXML::XML::Document.new + +Or you can utilize a namespace for your own work and include LibXML into it. +For example: + + require 'libxml' + + module MyApplication + include LibXML + + class MyClass + def some_method + document = XML::Document.new + end + end + end + +For simplicity's sake, the documentation uses the xml module in its examples. + +== Tests + +To run tests you first need to build the shared libary: + + rake compile + +Once you have build the shared libary, you can then run tests using rake: + + rake test + ++Build status: {rdoc-image:https://github.com/xml4r/libxml-ruby/actions/workflows/mri.yml/badge.svg}[https://github.com/xml4r/libxml-ruby/actions/workflows/mri.yml] + +== Performance + +In addition to being feature rich and conformation, the main reason +people use libxml-ruby is for performance. Here are the results +of a couple simple benchmarks recently blogged about on the +Web (you can find them in the benchmark directory of the +libxml distribution). + +From http://depixelate.com/2008/4/23/ruby-xml-parsing-benchmarks + + user system total real + libxml 0.032000 0.000000 0.032000 ( 0.031000) + Hpricot 0.640000 0.031000 0.671000 ( 0.890000) + REXML 1.813000 0.047000 1.860000 ( 2.031000) + +From https://svn.concord.org/svn/projects/trunk/common/ruby/xml_benchmarks/ + + user system total real + libxml 0.641000 0.031000 0.672000 ( 0.672000) + hpricot 5.359000 0.062000 5.421000 ( 5.516000) + rexml 22.859000 0.047000 22.906000 ( 23.203000) + + +== Documentation +Documentation is available via rdoc, and is installed automatically with the +gem. + +libxml-ruby's {online +documentation}[https://xml4r.github.io/libxml-ruby/rdoc/index.html] is generated +using Hanna, which is a development gem dependency. + +Note that older versions of Rdoc, which ship with Ruby 1.8.x, will report +a number of errors. To avoid them, install Rdoc 2.1 or higher. Once you have +installed the gem, you'll have to disable the version of Rdoc that Ruby 1.8.x +includes. An easy way to do that is rename the directory +ruby/lib/ruby/1.8/rdoc to +ruby/lib/ruby/1.8/rdoc_old. + +== Support +If you have any questions about using libxml-ruby, please report an issue +on GitHub[https://github.com/xml4r/libxml-ruby/issues]. + +== Memory Management +libxml-ruby automatically manages memory associated with the +underlying libxml2 library. The bindings create a one-to-one mapping between +Ruby objects and libxml documents and libxml parent nodes (ie, nodes that do not +have a parent and do not belong to a document). In these cases, +the bindings manage the memory. They do this by installing a free +function and storing a back pointer to the Ruby object from the xmlnode +using the _private member on libxml structures. When the Ruby object +goes out of scope, the underlying libxml structure is freed. Libxml +itself then frees all child nodes (recursively). + +For all other nodes (the vast majority), the bindings create temporary +Ruby objects that get freed once they go out of scope. Thus there can be +more than one Ruby object pointing to the same xml node. To mostly hide +this from a programmer on the Ruby side, the #eql? and #== methods are +overriden to check if two Ruby objects wrap the same xmlnode. If they do, +then the methods return true. During the mark phase, each of these temporary +objects marks its owning document, thereby keeping the Ruby document object +alive and thus the xmldoc tree. + +In the sweep phase of the garbage collector, or when a program ends, +there is no order to how Ruby objects are freed. In fact, the Ruby document +object is almost always freed before any Ruby objects that wrap child nodes. +However, this is ok because those Ruby objects do not have a free function +and are no longer in scope (since if they were the document would not be freed). + +== License +See LICENSE for license information.