Module: Ronin::Web

Defined in:
lib/ronin/web/web.rb,
lib/ronin/web/spider.rb,
lib/ronin/web/config.rb,
lib/ronin/web/version.rb,
lib/ronin/web/proxy/app.rb,
lib/ronin/web/mechanize.rb,
lib/ronin/web/proxy/web.rb,
lib/ronin/web/server/web.rb,
lib/ronin/web/server/app.rb,
lib/ronin/web/proxy/base.rb,
lib/ronin/web/user_agents.rb,
lib/ronin/web/server/base.rb,
lib/ronin/web/middleware/rule.rb,
lib/ronin/web/middleware/base.rb,
lib/ronin/web/middleware/proxy.rb,
lib/ronin/web/middleware/files.rb,
lib/ronin/web/middleware/router.rb,
lib/ronin/web/middleware/request.rb,
lib/ronin/web/middleware/helpers.rb,
lib/ronin/web/middleware/response.rb,
lib/ronin/web/middleware/directories.rb,
lib/ronin/web/middleware/proxy_request.rb,
lib/ronin/web/middleware/filters/ip_filter.rb,
lib/ronin/web/middleware/filters/path_filter.rb,
lib/ronin/web/middleware/filters/vhost_filter.rb,
lib/ronin/web/middleware/filters/referer_filter.rb,
lib/ronin/web/middleware/filters/campaign_filter.rb,
lib/ronin/web/middleware/filters/user_agent_filter.rb

Defined Under Namespace

Modules: Config, Middleware, Proxy, Server

Classes: Mechanize, Spider, UserAgents

Constant Summary

VERSION = '0.3.0.rc1'

Ronin Web version.

Class Method Details

+ (Mechanize) agent(options = {})

A persistent Mechanize Agent.

Returns:

  • (Mechanize)

    The persistent Mechanize Agent.

# File 'lib/ronin/web/web.rb', line 355

def Web.agent(options={})
  if options.empty?
    @agent ||= Mechanize.new(options)
  else
    @agent = Mechanize.new(options)
  end
end
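
The caching behavior in the listing above can be sketched without Mechanize. In this minimal sketch, `FakeAgent` and `AgentCache` are hypothetical stand-ins that are not part of Ronin::Web; only the `||=` logic mirrors `Web.agent`. Empty options reuse the cached agent, while any non-empty options build a new agent that replaces the cached one.

```ruby
# Hypothetical stand-in for Mechanize; it only records its options.
class FakeAgent
  attr_reader :options

  def initialize(options = {})
    @options = options
  end
end

# Hypothetical module mirroring the memoization logic of Web.agent.
module AgentCache
  def self.agent(options = {})
    if options.empty?
      # Reuse the cached agent, creating it on first use.
      @agent ||= FakeAgent.new(options)
    else
      # Non-empty options always replace the cached agent.
      @agent = FakeAgent.new(options)
    end
  end
end

first  = AgentCache.agent
second = AgentCache.agent
custom = AgentCache.agent(:user_agent => 'the future')
after  = AgentCache.agent

puts first.equal?(second)  # => true, the same cached object
puts custom.equal?(first)  # => false, a new agent was built
puts after.equal?(custom)  # => true, the new agent is now cached
```

If the real method behaves like this sketch, options passed to one call (for example through Web.get) would also apply to later calls that pass no options.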

+ (Nokogiri::HTML::Builder) build_html { ... }

Creates a new Nokogiri::HTML::Builder.

Examples:

Web.build_html do
  html {
    body {
      div(:style => 'display:none;') {
        object(:classid => 'blabla')
      }
    }
  }
end

Yields:

  • The block that will be used to construct the HTML document.

Returns:

  • (Nokogiri::HTML::Builder)

    The new HTML builder object.

# File 'lib/ronin/web/web.rb', line 85

def Web.build_html(&block)
  Nokogiri::HTML::Builder.new(&block)
end

+ (Nokogiri::XML::Builder) build_xml { ... }

Creates a new Nokogiri::XML::Builder.

Examples:

Web.build_xml do
  post(:id => 2) {
    title { text('some example') }
    body { text('this is one contrived example.') }
  }
end

Yields:

  • The block that will be used to construct the XML document.

Returns:

  • (Nokogiri::XML::Builder)

    The new XML builder object.

# File 'lib/ronin/web/web.rb', line 137

def Web.build_xml(&block)
  Nokogiri::XML::Builder.new(&block)
end

+ (Mechanize::Page) get(url, options = {}) {|page| ... }

Creates a Mechanize Page for the contents at a given URL.

Examples:

Web.get('http://www.rubyinside.com')
# => Mechanize::Page

Web.get('http://www.rubyinside.com') do |page|
  page.search('div.post/h2/a').each do |title|
    puts title.inner_text
  end
end

Parameters:

  • url (URI::Generic, String)

    The URL to request.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :user_agent (String)

    The User-Agent string to use.

  • :user_agent_alias (String)

    The User-Agent Alias to use.

  • :proxy (Network::HTTP::Proxy, Hash) — default: Web.proxy

    Proxy information.

Yields:

  • (page)

    If a block is given, it will be passed the page for the requested URL.

Yield Parameters:

  • page (Mechanize::Page)

    The requested page.

Returns:

  • (Mechanize::Page)

    The requested page.

# File 'lib/ronin/web/web.rb', line 406

def Web.get(url,options={})
  page = Web.agent(options).get(url)

  yield page if block_given?
  return page
end

+ (String) get_body(url, options = {}) {|body| ... }

Requests the body of the Mechanize Page created from the response of the given URL.

Examples:

Web.get_body('http://www.rubyinside.com') # => String

Web.get_body('http://www.rubyinside.com') do |body|
  puts body
end

Parameters:

  • url (URI::Generic, String)

    The URL to request.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :user_agent (String)

    The User-Agent string to use.

  • :user_agent_alias (String)

    The User-Agent Alias to use.

  • :proxy (Network::HTTP::Proxy, Hash) — default: Web.proxy

    Proxy information.

Yields:

  • (body)

    If a block is given, it will be passed the body of the page.

Yield Parameters:

  • body (String)

    The requested body of the page.

Returns:

  • (String)

    The requested body of the page.

# File 'lib/ronin/web/web.rb', line 453

def Web.get_body(url,options={})
  body = Web.get(url,options).body

  yield body if block_given?
  return body
end

+ (Nokogiri::HTML::Document) html(body) {|doc| ... }

Parses the body of a document into an HTML document object.

Parameters:

  • body (String, IO)

    The body of the document to parse.

Yields:

  • (doc)

    If a block is given, it will be passed the newly created document object.

Yield Parameters:

  • doc (Nokogiri::HTML::Document)

    The new HTML document object.

Returns:

  • (Nokogiri::HTML::Document)

    The new HTML document object.

# File 'lib/ronin/web/web.rb', line 54

def Web.html(body)
  doc = Nokogiri::HTML(body)

  yield doc if block_given?
  return doc
end

+ (File) open(url, options = {})

Opens a URL as a temporary file.

Examples:

Open a given URL.

Web.open('http://rubyflow.com/')

Open a given URL, using a custom User-Agent alias.

Web.open('http://tenderlovemaking.com/',
  :user_agent_alias => 'Linux Mozilla')

Open a given URL, using a custom User-Agent string.

Web.open('http://www.wired.com/', :user_agent => 'the future')

Parameters:

  • url (URI::Generic, String)

    The URL to open.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :user_agent (String)

    The User-Agent string to use.

  • :user_agent_alias (String)

    The User-Agent Alias to use.

  • :proxy (Network::HTTP::Proxy, Hash, String) — default: Web.proxy

    Proxy information.

  • :user (String)

    The HTTP Basic Authentication user name.

  • :password (String)

    The HTTP Basic Authentication password.

  • :content_length_proc (Proc)

    A callback which will be passed the content-length of the HTTP response.

  • :progress_proc (Proc)

    A callback which will be passed the size of each fragment, once received from the server.

Returns:

  • (File)

    The contents of the URL.

# File 'lib/ronin/web/web.rb', line 310

def Web.open(url,options={})
  user_agent_alias = options.delete(:user_agent_alias)
  proxy = Network::HTTP::Proxy.create(
    options.delete(:proxy) || Web.proxy
  )
  user = options.delete(:user)
  password = options.delete(:password)
  content_length_proc = options.delete(:content_length_proc)
  progress_proc = options.delete(:progress_proc)

  headers = Network::HTTP.headers(options)

  if user_agent_alias
    headers['User-Agent'] = Web.user_agent_aliases[user_agent_alias]
  end

  if proxy[:host]
    headers[:proxy] = proxy.url
  end

  if user
    headers[:http_basic_authentication] = [user, password]
  end

  if content_length_proc
    headers[:content_length_proc] = content_length_proc
  end

  if progress_proc
    headers[:progress_proc] = progress_proc
  end

  return Kernel.open(url,headers)
end
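
Because the listing above removes keys with `Hash#delete`, the caller's `options` Hash is modified in place. The sketch below illustrates this with a hypothetical `extract_open_options` helper (not part of Ronin::Web, and covering only a few of the keys) and shows that passing a `dup` keeps the original Hash intact.

```ruby
# Hypothetical helper that pulls out keys the same way Web.open does.
def extract_open_options(options = {})
  extracted = {}

  [:user_agent_alias, :proxy, :user, :password].each do |key|
    # Hash#delete removes the key from the Hash it is called on.
    extracted[key] = options.delete(key) if options.key?(key)
  end

  extracted
end

opts = {
  :user     => 'admin',
  :password => 'secret',
  :referer  => 'http://example.com/'
}

# Passing a copy leaves the caller's Hash untouched.
extract_open_options(opts.dup)
puts opts.key?(:user)   # => true

# Passing the Hash itself strips the extracted keys from it.
extract_open_options(opts)
puts opts.key?(:user)   # => false
puts opts.keys.inspect  # => [:referer]
```

Assuming the real method behaves like the listing, reusing one options Hash across several Web.open calls would lose these keys after the first call unless a copy is passed.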

+ (Mechanize::Page) post(url, options = {}) {|page| ... }

Posts to a given URL and creates a Mechanize Page from the response.

Examples:

Web.post('http://www.rubyinside.com')
# => Mechanize::Page

Parameters:

  • url (URI::Generic, String)

    The URL to request.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :query (Hash)

    Additional query parameters to post with.

  • :user_agent (String)

    The User-Agent string to use.

  • :user_agent_alias (String)

    The User-Agent Alias to use.

  • :proxy (Network::HTTP::Proxy, Hash) — default: Web.proxy

    Proxy information.

Yields:

  • (page)

    If a block is given, it will be passed the page for the requested URL.

Yield Parameters:

  • page (Mechanize::Page)

    The requested page.

Returns:

  • (Mechanize::Page)

    The requested page.

# File 'lib/ronin/web/web.rb', line 499

def Web.post(url,options={})
  query = {}
  query.merge!(options[:query]) if options[:query]

  page = Web.agent(options).post(url,query)

  yield page if block_given?
  return page
end
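
How the form data is derived from `:query` in the listing above can be sketched in isolation. `post_query` is a hypothetical helper, not part of Ronin::Web; it only mirrors the two lines that build `query`.

```ruby
# Hypothetical helper mirroring how Web.post derives its form data.
def post_query(options = {})
  query = {}
  # Copy the :query entries, if any, into a fresh Hash.
  query.merge!(options[:query]) if options[:query]
  query
end

params  = { :q => 'ronin', :page => 2 }
options = { :query => params, :user_agent => 'the future' }

query = post_query(options)

puts query == params          # => true, same entries
puts query.equal?(params)     # => false, a copy and not the caller's Hash
puts post_query({}).inspect   # => {}, no :query means an empty form
```

Only the entries under `:query` become form data in this sketch; in the listing, the full `options` Hash is still handed to Web.agent.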

+ (String) post_body(url, options = {}) {|body| ... }

Posts to a given URL and returns the body of the Mechanize Page created from the response.

Examples:

Web.post_body('http://www.rubyinside.com')
# => String

Web.post_body('http://www.rubyinside.com') do |body|
  puts body
end

Parameters:

  • url (URI::Generic, String)

    The URL to request.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :query (Hash)

    Additional query parameters to post with.

  • :user_agent (String)

    The User-Agent string to use.

  • :user_agent_alias (String)

    The User-Agent Alias to use.

  • :proxy (Network::HTTP::Proxy, Hash) — default: Web.proxy

    Proxy information.

Yields:

  • (body)

    If a block is given, it will be passed the body of the page.

Yield Parameters:

  • body (String)

    The body of the requested page.

Returns:

  • (String)

    The body of the requested page.

# File 'lib/ronin/web/web.rb', line 553

def Web.post_body(url,options={})
  body = Web.post(url,options).body

  yield body if block_given?
  return body
end

+ (Network::HTTP::Proxy) proxy

Proxy information for Ronin::Web to use.

Returns:

  • (Network::HTTP::Proxy)

    The Ronin Web proxy information.

# File 'lib/ronin/web/web.rb', line 151

def Web.proxy
  (@proxy ||= nil) || Network::HTTP.proxy
end

+ (Network::HTTP::Proxy) proxy=(new_proxy)

Sets the proxy used by Ronin::Web.

Parameters:

  • new_proxy (Network::HTTP::Proxy, URI::HTTP, Hash, String)

    The new proxy information to use.

Returns:

  • (Network::HTTP::Proxy)

    The new proxy.

Since:

  • 0.3.0



# File 'lib/ronin/web/web.rb', line 168

def Web.proxy=(new_proxy)
  @proxy = Network::HTTP::Proxy.create(new_proxy)
end
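
The getter and setter above implement a module-level override with a global fallback. A minimal sketch of that pattern follows; `GlobalHTTP` and `WebSettings` are hypothetical stand-ins for Network::HTTP and Ronin::Web, and plain Hashes stand in for Network::HTTP::Proxy objects.

```ruby
# Hypothetical global default, standing in for Network::HTTP.proxy.
module GlobalHTTP
  def self.proxy
    { :host => 'global.example.com', :port => 8080 }
  end
end

# Hypothetical module mirroring Web.proxy and Web.proxy=.
module WebSettings
  def self.proxy
    # Use the module-specific value if set, else the global default.
    (@proxy ||= nil) || GlobalHTTP.proxy
  end

  def self.proxy=(new_proxy)
    @proxy = new_proxy
  end
end

before = WebSettings.proxy
WebSettings.proxy = { :host => '127.0.0.1', :port => 9050 }
after = WebSettings.proxy

puts before[:host]  # => global.example.com
puts after[:host]   # => 127.0.0.1
```

In this sketch the override only affects `WebSettings`; the global default is left unchanged and is used again if the override is reset to nil.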

+ (Object) proxy_server(options = {}, &block)

Returns the Ronin Web Proxy. When called for the first time, the proxy will be started in the background.



# File 'lib/ronin/web/proxy/web.rb', line 35

def Web.proxy_server(options={},&block)
  unless class_variable_defined?('@@ronin_web_proxy')
    @@ronin_web_proxy = Proxy::App
    @@ronin_web_proxy.run!(options.merge(:background => true))
  end

  @@ronin_web_proxy.class_eval(&block)

  return @@ronin_web_proxy
end

+ (Server::App) server(options = {}) {|server| ... }

Returns the Ronin Web Server.

Examples:

Web.server do
  get '/hello' do
    'world'
  end
end

Parameters:

  • options (Hash) (defaults to: {})

    Additional options.

Yields:

  • (server)

    If a block is given, it will be passed the current web server.

Yield Parameters:

  • server (Server::App)

    The current web server.

Returns:

  • (Server::App)

    The Ronin Web Server.

Since:

  • 0.2.0



# File 'lib/ronin/web/server/web.rb', line 55

def Web.server(options={},&block)
  unless class_variable_defined?('@@ronin_web_server')
    @@ronin_web_server = Server::App
    @@ronin_web_server.run!(options.merge(:background => true))
  end

  @@ronin_web_server.class_eval(&block)

  return @@ronin_web_server
end
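
The lazy start and `class_eval` configuration in the listing above can be sketched with plain Ruby. `FakeApp` and `LazyServer` are hypothetical stand-ins, not part of Ronin::Web. The sketch also adds an `if block` guard, since `class_eval` raises ArgumentError when given neither a block nor a string; whether calling Web.server without a block is affected in practice has not been verified here.

```ruby
# Hypothetical stand-in for Server::App; run! only records its options.
class FakeApp
  class << self
    attr_reader :run_options

    def run!(options = {})
      @run_options = options
    end

    def routes
      @routes ||= {}
    end

    # Minimal Sinatra-like route registration.
    def get(path, &handler)
      routes[path] = handler
    end
  end
end

# Hypothetical module mirroring the lazy start in Web.server.
module LazyServer
  def self.server(options = {}, &block)
    unless defined?(@server)
      @server = FakeApp
      # Started once, on first use.
      @server.run!(options.merge(:background => true))
    end

    # Guard added in this sketch: skip class_eval when no block is given.
    @server.class_eval(&block) if block

    @server
  end
end

LazyServer.server(:port => 8000) do
  get('/hello') { 'world' }
end

app = LazyServer.server  # no block, and run! is not called again

puts app.routes['/hello'].call  # => world
puts app.run_options[:background]  # => true
```

Because the block is evaluated with `class_eval`, calls such as `get` inside it resolve against the application class itself, which is why the example in the documentation above can define routes directly.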

+ (String?) user_agent

The User-Agent string used by Ronin::Web.

Returns:

  • (String, nil)

    The Ronin Web User-Agent.

# File 'lib/ronin/web/web.rb', line 211

def Web.user_agent
  (@user_agent ||= nil) || Network::HTTP.user_agent
end

+ (String) user_agent=(value)

Sets the User-Agent string used by Ronin::Web.

Parameters:

  • value (String, Symbol, Regexp, nil)

    The User-Agent string to use. Setting user_agent to nil will disable the User-Agent string.

Returns:

  • (String)

    The new User-Agent string.

Raises:

  • (RuntimeError)

    Either no User-Agent group exists with the given Symbol, or no User-Agent string matched the given Regexp.



# File 'lib/ronin/web/web.rb', line 231

def Web.user_agent=(value)
  @user_agent = case value
                when String
                  user_agents.fetch(value,value)
                when nil
                  nil
                else
                  user_agents.fetch(value)
                end
end
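
The dispatch in the listing above can be sketched with a plain Hash in place of the UserAgents object. `KNOWN_AGENTS` and `resolve_user_agent` are hypothetical, and the sketch assumes that `UserAgents#fetch` accepts an optional default in the same way `Hash#fetch` does, as the listing suggests.

```ruby
# Hypothetical stand-in for Web.user_agents: a plain Hash of known strings.
KNOWN_AGENTS = {
  'example-alias' => 'ExampleBrowser/1.0 (hypothetical)'
}

# Hypothetical function mirroring the case statement in Web.user_agent=.
def resolve_user_agent(value, agents = KNOWN_AGENTS)
  case value
  when String
    # A known name maps to its entry; any other String is used verbatim.
    agents.fetch(value, value)
  when nil
    # nil disables the User-Agent.
    nil
  else
    # Other lookups must succeed; Hash#fetch raises KeyError on a miss.
    agents.fetch(value)
  end
end

puts resolve_user_agent('example-alias')  # => ExampleBrowser/1.0 (hypothetical)
puts resolve_user_agent('the future')     # => the future
puts resolve_user_agent(nil).inspect      # => nil

begin
  resolve_user_agent(:unknown_group)
rescue KeyError => error
  puts "lookup failed: #{error.class}"    # => lookup failed: KeyError
end
```

The Hash stand-in raises KeyError on a failed lookup, whereas the documentation above states that the real method raises RuntimeError for an unknown Symbol or an unmatched Regexp.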

+ (String) user_agent_alias=(name)

Deprecated.

Will be replaced by calling user_agent= with a Symbol and will be removed in 1.0.0.

Sets the Ronin Web User-Agent.

Parameters:

  • name (String)

    The User-Agent alias to use.

Returns:

  • (String)

    The new User-Agent string.

# File 'lib/ronin/web/web.rb', line 259

def Web.user_agent_alias=(name)
  @user_agent = Web.user_agent_aliases[name.to_s]
end

+ (Hash) user_agent_aliases

Deprecated.

Will be replaced by user_agents in 1.0.0.

The supported Web User-Agent Aliases.

Returns:

  • (Hash)

    The supported Web User-Agent Aliases, keyed by alias name.

# File 'lib/ronin/web/web.rb', line 197

def Web.user_agent_aliases
  Mechanize::AGENT_ALIASES
end

+ (UserAgents) user_agents

A set of common User-Agent strings.

Returns:

  • (UserAgents)

    The set of common User-Agent strings.

Since:

  • 0.3.0



# File 'lib/ronin/web/web.rb', line 182

def Web.user_agents
  @user_agents ||= UserAgents.new
end

+ (Nokogiri::XML::Document) xml(body) {|doc| ... }

Parses the body of a document into an XML document object.

Parameters:

  • body (String, IO)

    The body of the document to parse.

Yields:

  • (doc)

    If a block is given, it will be passed the newly created document object.

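Yield Parameters:

  • doc (Nokogiri::XML::Document)

    The new XML document object.

Returns:

  • (Nokogiri::XML::Document)

    The new XML document object.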



# File 'lib/ronin/web/web.rb', line 109

def Web.xml(body)
  doc = Nokogiri::XML(body)

  yield doc if block_given?
  return doc
end