Fix some encoding and title issues [[bugzilla:22020]], [[bugzilla:20311]], [[bugzilla:21976]]

* Create greater separation between display, pagename and queryname
article.title == pagename or search param (queryname)
article.display_title == The human readable title (iPod instead of IPod)
article.page_name == wgPageName

* CGI escape should not be used for URIs. http://blog.tquadrado.com/?p=172
- Use URI::escape/unescape (in javascript encodeURI)
- Use modified URI::escape/unescape (in javascript encodeURIComponent)

* url_to_html() should be used when writing links into raw HTML (%a{ :href => } does this automatically)

* I retained some old stuff. It could probably be removed at some point.
- current_name (used by non-articles)
- escaped_title and uri_escaped_title (used as params to download from wikipedia proper)
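
The difference between form-style escaping and URI-style escaping described above can be seen in a few lines of Ruby. This is a minimal sketch and not code from this commit: `URI::escape` was removed in Ruby 3.0, so `ERB::Util.url_encode` stands in here for the component-style encoder, and the sample title and the `percent_decode` lambda are made up for illustration.

```ruby
require "cgi"
require "erb"

title = "M+ Fonts" # made-up title containing both "+" and a space

# CGI.escape follows the form-data rules: a space becomes "+".
form_encoded = CGI.escape(title)

# ERB::Util.url_encode percent-encodes the space instead, which is what a
# URI path segment or an encodeURIComponent-style consumer expects.
component_encoded = ERB::Util.url_encode(title)

# A consumer that only undoes %XX sequences (as a path decoder does)
# cannot tell the "+" that meant "space" from a literal plus sign.
percent_decode = ->(s) { s.gsub(/%(\h\h)/) { Regexp.last_match(1).hex.chr } }

puts form_encoded                            # M%2B+Fonts
puts component_encoded                       # M%2B%20Fonts
puts percent_decode.call(form_encoded)       # M++Fonts
puts percent_decode.call(component_encoded)  # M+ Fonts
```

Only the percent-encoded form survives a plain percent-decode, which is presumably why the commit moves away from `CGI::escape` for anything that ends up in a URI.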
hartman committed Jan 6, 2010
1 parent 7940804 commit 55b503f
Showing 11 changed files with 42 additions and 39 deletions.
4 changes: 2 additions & 2 deletions app/controllers/articles.rb
@@ -128,9 +128,9 @@ def cache_block(&block)
end

protected

# This is URI encoded.
def current_name
@name ||= (params[:search] || params[:title] || params[:file] || "").gsub("_", " ")
@name ||= (params[:search] || params[:title] || params[:file] || "")
end

def cache_key
18 changes: 8 additions & 10 deletions app/helpers/global_helpers.rb
@@ -7,7 +7,7 @@ module GlobalHelpers
# The real rule here, is that if its a home page, leave it empty
def search_bar_contents
if current_wiki['mobile_main_page'] != current_name
CGI::unescape(current_name).force_encoding("UTF-8")
URI::unescape(current_name).force_encoding("UTF-8").gsub( "_", " ")
else
""
end
@@ -36,36 +36,34 @@ def button_to(text, to, id = nil)
%|<form method="get" action="#{to}"><button type="submit" id="#{id}Button">#{text}</button></form>|
end

# shortcuts for URLs
# The path parameters should be valid wiki wgPageName(s)
def path_site
%|http://#{request.language_code}.wikipedia.org|
end

def path_encoded(path)
CGI::escape(path)
end

def redirect_url
%|#{path_site}/w/mobileRedirect.php|
end

def temp_url(path)
%|#{redirect_url}?to=#{path_site}/wiki/#{path_encoded(path)}|
%|#{redirect_url}?to=#{encode_query_component(path_site + "/wiki/" + path)}|
end

def disable_url(path)
%|/disable/#{path_encoded(path)}|
%|/disable/#{path}|
end

def perm_url(path)
%|#{temp_url(path)}&amp;expires_in_days=#{365 * 10}|
%|#{temp_url(path)}&expires_in_days=#{365 * 10}|
end

def action_url(path,action)
%|#{path_site}/w/index.php?title=#{path_encoded(path)}&amp;action=#{action}&amp;useskin=chick|
%|#{path_site}/w/index.php?title=#{encode_query_component(path)}&action=#{action}&useskin=chick|
end

def stop_redirect_notice(path)
%|<a href="#{temp_url(path)}">#{language_object["regular_wikipedia"]}</a>
%|<a href="#{temp_url(path)}">#{language_object["regular_wikipedia"]}</a>
<div id="perm">
<a href="#{disable_url(path)}">#{language_object["perm_stop_redirect"]}</a>
</div>|
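
The reason `temp_url` now encodes the whole target before placing it in `to=` can be sketched as follows. This is an illustration under stated assumptions, not the commit's code: `encode_query_component` is not defined anywhere in this diff, so `URI.encode_www_form_component` is used as a stand-in, and `site`, `redirect`, `naive_temp_url`, `encoded_temp_url` and `extract_to` are hypothetical helper names.

```ruby
require "uri"

# Stand-ins for path_site and redirect_url with a fixed language code.
def site
  "http://en.wikipedia.org"
end

def redirect
  "#{site}/w/mobileRedirect.php"
end

# Old shape: the target URL is pasted into the query string unencoded.
def naive_temp_url(page)
  "#{redirect}?to=#{site}/wiki/#{page}"
end

# New shape: the full target URL is encoded as a single query component.
def encoded_temp_url(page)
  "#{redirect}?to=#{URI.encode_www_form_component("#{site}/wiki/#{page}")}"
end

# Parse the query string the way the receiving script would and return "to".
def extract_to(url)
  query = url.split("?", 2).last
  URI.decode_www_form(query).to_h["to"]
end

page = "AT&T" # a page name containing a query delimiter

puts extract_to(naive_temp_url(page))    # target is cut off at the "&"
puts extract_to(encoded_temp_url(page))  # full target survives
```

With the unencoded variant, the `&` in the page name ends the `to` parameter early; encoding the complete target keeps it intact for the redirect script.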
2 changes: 1 addition & 1 deletion app/models/article.rb
@@ -83,7 +83,7 @@ def html

def fetch!(*paths)
if !paths.any?
paths = (@paths ||= ["/wiki/#{escaped_title}", "/wiki/Special:Search?search=#{uri_escaped_title}"])
paths = (@paths ||= ["/wiki/#{title}", "/wiki/Special:Search?search=#{uri_escaped_title}"])
end
super(*paths)
end
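
The search path above interpolates `uri_escaped_title`, which this commit redefines in `wikipedia_resource.rb` as "unescape first, then escape everything outside the unreserved set". A minimal sketch of that decode-then-encode idea follows. It is an approximation, not the commit's implementation: it uses hand-written helpers with hypothetical names (`percent_decode`, `percent_encode`, `normalize_title`) and the RFC 3986 unreserved set rather than `URI::escape` with `URI::PATTERN::UNRESERVED`.

```ruby
# Undo %XX sequences byte by byte; anything that is not a valid escape stays.
def percent_decode(str)
  str.b.gsub(/%([0-9A-Fa-f]{2})/) { Regexp.last_match(1).hex.chr }
     .force_encoding(Encoding::UTF_8)
end

# Percent-encode every byte outside the RFC 3986 unreserved set.
def percent_encode(str)
  str.b.gsub(/[^A-Za-z0-9\-._~]/) { |byte| format("%%%02X", byte.ord) }
     .force_encoding(Encoding::UTF_8)
end

# Decoding first means an already-encoded title is not encoded twice.
def normalize_title(title)
  percent_encode(percent_decode(title))
end

puts normalize_title("Café")        # Caf%C3%A9
puts normalize_title("Caf%C3%A9")   # Caf%C3%A9 (unchanged, not double-encoded)
puts normalize_title("M%2B_Fonts")  # M%2B_Fonts
```

Because the title may arrive either raw or already URI encoded, normalizing through a decode step gives the same search parameter in both cases.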
4 changes: 2 additions & 2 deletions app/models/parsers/wml.rb
@@ -5,10 +5,10 @@ def self.parse(article, options = {})
html= super(article, options) # Do everything xhtml does like rough cutting of content, setting of page title
page = Nokogiri::HTML(html)
idx= 0
toc= "<card id='toc' title='#{article.title}'>"
toc= "<card id='toc' title='#{article.display_title}'>"
result= ""
block=[]
block_title= article.title
block_title= article.display_title
page.xpath("//h2|//p").each do |elem|
case elem.name
when "p"
4 changes: 0 additions & 4 deletions app/models/parsers/xhtml.rb
@@ -47,10 +47,6 @@ def self.parse(article, options = {})
# Remove all of the medialists
doc.css(".medialist").each { |m| m.parent.remove }

# For getting the human-readable title of the page
# grab what's in the .first-heading div
article.title = doc.css(".firstHeading").first.inner_html

# Ah, hot and fresh html from the parser
html = doc.to_xhtml

31 changes: 20 additions & 11 deletions app/models/wikipedia_resource.rb
@@ -3,8 +3,12 @@
module Wikipedia
# A Wikipedia resource is the base type for all resources
class Resource
# String:title The human readable version of the title
# String:title The requested title (normally URI encoded)
attr :title, true
# String:display_title The human readable version of the title
attr :display_title, true
# String:page_name The canonical pagename of the article
attr :page_name, true

# String:path if path is set, then we are specifying where fetching happens to the server object
attr :path, true
@@ -33,6 +37,8 @@ def initialize(server_or_language, title = nil, path = nil, device = nil)
@server = server_or_language.kind_of?(Server) ? server_or_language : Server.new(server_or_language)
@title, @path, @device = title, path, device
@loaded = false
@display_title = "ERROR: blank title"
@page_name = "ERROR: blank page_name"
end

def loaded?
@@ -50,26 +56,29 @@ def fetch!(*paths)
end
@raw_html = result[:body]
@raw_document = Nokogiri::XML(@raw_html)
self.title ||= raw_document.css("title").first.inner_html.gsub(" - Wikipedia, the free encyclopedia", "")

# For getting the human-readable title of the page
# grab what's in the .first-heading div
# Test with iPod and M<sup>+</sup>_Fonts
self.display_title = raw_document.css(".firstHeading").first.text

vars = raw_document.css('script:contains("wgPageName")').first.text
self.page_name = /^wgPageName=\"(.*)\"/.match( vars )[1]
self.loaded = true
end

def encoded_title
CGI::unescape(@title)
end

private
# Used internally to get the escaped title
# For use by fetch in app/models/article.rb
# :api: public
def escaped_title
return "" if title.nil?
@escaped_title ||= title.strip.gsub(" ", "_")
end




def uri_escaped_title
@uri_escaped_title ||= URI::escape(escaped_title)
@unescaped_title ||= URI::unescape(escaped_title)
@uri_escaped_title ||= URI::escape(@unescaped_title, Regexp.new("[^#{URI::PATTERN::UNRESERVED}]"))
end
end
end
end
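
In `fetch!` above, `page_name` is read from the `wgPageName` variable in the page's inline script. The sketch below isolates that step in plain Ruby so it can be tried without Nokogiri. The sample script text is an assumed shape inferred from the regex in the diff, and `extract_page_name` is a hypothetical helper; unlike the diff, it returns `nil` rather than raising when the variable is absent.

```ruby
# Pull the canonical page name out of the inline script text.
# [^"]* stops at the first closing quote instead of the last one on the line.
def extract_page_name(script_text)
  match = /^wgPageName="([^"]*)"/.match(script_text)
  match && match[1]
end

# Assumed shape of the script block, one variable per line.
sample_script = <<~JS
  var skin="chick",
  wgPageName="M+_Fonts",
  wgTitle="M+ Fonts";
JS

puts extract_page_name(sample_script)             # M+_Fonts
puts extract_page_name("var skin=1;").inspect     # nil
```

Returning `nil` for a missing variable lets the caller keep a placeholder such as the "ERROR: blank page_name" default set in `initialize`.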
2 changes: 1 addition & 1 deletion app/views/information/disable.html.haml
@@ -5,7 +5,7 @@

#disableButtons
%form{:action => redirect_url, :method => "get"}
%input{:type => "hidden", :name => "to", :value => "#{path_site}/wiki/#{path_encoded(@path)}"}
%input{:type => "hidden", :name => "to", :value => encode_query_component(path_site + "/wiki/" + URI::decode(@path))}
%input{:type => "hidden", :name => "expires_in_days", :value => "#{365 * 10}"}
%button#disableButton{:type => "submit"}= language_object['disable_button']

2 changes: 1 addition & 1 deletion app/views/layout/_footmenu_default.html.haml
@@ -1,6 +1,6 @@
#footmenu.nav
.notice
- if @article
= stop_redirect_notice(@article.title)
= stop_redirect_notice(@article.page_name)
- else
= stop_redirect_notice("")
8 changes: 4 additions & 4 deletions app/views/layout/_footmenu_simple.html.haml
@@ -13,22 +13,22 @@
%br>
%span.idx 3
- if @article
%a{:href => temp_url(@article.title), :accesskey => "3"}= language_object["regular_wikipedia"]
%a{:href => temp_url(@article.page_name), :accesskey => "3"}= language_object["regular_wikipedia"]
- else
%a{:href => temp_url(""), :accesskey => "3"}= language_object["regular_wikipedia"]
-# 4 is Talk, 6 Hist
- if @article
%br>
%span.idx 5
%a{:href => action_url(@article.title, "edit"), :accesskey => "5"}= language_object["nav_edit"]
%a{:href => action_url(@article.page_name, "edit"), :accesskey => "5"}= language_object["nav_edit"]
%br>
%span.idx 6
%a{:href => action_url(@article.title, "history"), :accesskey => "6"}= language_object["nav_history"]
%a{:href => action_url(@article.page_name, "history"), :accesskey => "6"}= language_object["nav_history"]
%br>
%span.idx 8
%a{:href => "#footmenu", :accesskey => "8"}= language_object["nav_end"]
%br>
- if @article
%a{:href => disable_url(@article.title)}= language_object["perm_stop_redirect"]
%a{:href => disable_url(@article.page_name)}= language_object["perm_stop_redirect"]
- else
%a{:href => disable_url("")}= language_object["perm_stop_redirect"]
4 changes: 2 additions & 2 deletions app/views/layout/_javascript.html.haml
@@ -1,12 +1,12 @@

- if @article
:javascript
var title = "#{h @article.title.gsub("_", " ")}";
var title = "#{h @article.page_name}";
var server = "#{@article.server.base_url}";
function shouldCache() {
return true;
}
- elsif current_name != ""
:javascript
var title = "#{@name}";
var server = "#{current_server.base_url}";
var server = "#{current_server.base_url}";
2 changes: 1 addition & 1 deletion app/views/layout/application.html.haml
@@ -2,7 +2,7 @@
!!!
%html{:xmlns => "http://www.w3.org/1999/xhtml", 'xml:lang' => "en-us", :lang => "en-us"}
%head
%title= @article ? @article.encoded_title : "Wikipedia"
%title= @article ? @article.display_title : "Wikipedia"
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
%link{:type => "text/css", :href => "/stylesheets/#{request.device.css_file_name}.css", :rel => "Stylesheet", :media => "all"}/
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />