Archive for the ‘Open Sitemap Generator’ Category.
May 12th, 2007 @ 8:35 by Mike
We’re glad to announce that Open Sitemap Generator is now a db4o Community Project.
You can access its Project Space here.
Thanks to German for his support.
Mike
May 7th, 2007 @ 14:23 by Mike
Version 0.6 is out!
In this new version we’re using the great db4o technology to store all the retrieved URLs.
At the moment it’s only a temporary database that is deleted when crawling ends, but it will enable us to add sitemap management in a future version.
The main new features are: splitting big sitemaps and using an index file, an option to ignore HTML comments, and an installer version (the software is still fully portable).
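As an illustration of the splitting idea, here is a minimal sketch, not the code used in Open Sitemap Generator. It relies on the sitemaps.org protocol, which limits a single sitemap file to 50,000 URLs and defines a `sitemapindex` document that points at the individual files. The class name, the method names, and the `sitemapN.xml` file naming are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapSplitSketch
{
    // XML namespace defined by the sitemaps.org protocol
    static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Splits the URL list into chunks of at most maxPerFile entries,
    // producing one <urlset> document per chunk.
    public static List<XDocument> BuildSitemaps(List<string> urls, int maxPerFile)
    {
        var docs = new List<XDocument>();
        for (int start = 0; start < urls.Count; start += maxPerFile)
        {
            int count = Math.Min(maxPerFile, urls.Count - start);
            List<string> chunk = urls.GetRange(start, count);
            docs.Add(new XDocument(
                new XElement(Ns + "urlset",
                    chunk.Select(u => new XElement(Ns + "url",
                        new XElement(Ns + "loc", u))))));
        }
        return docs;
    }

    // Builds the index file that lists every sitemap.
    // baseUrl and the "sitemapN.xml" naming scheme are hypothetical.
    public static XDocument BuildIndex(string baseUrl, int fileCount)
    {
        return new XDocument(
            new XElement(Ns + "sitemapindex",
                Enumerable.Range(1, fileCount).Select(n =>
                    new XElement(Ns + "sitemap",
                        new XElement(Ns + "loc", baseUrl + "sitemap" + n + ".xml")))));
    }
}
```

With a real site, `maxPerFile` would be 50,000 or lower. `XElement` escapes special characters in the URL text when the document is saved, so no manual escaping is needed in this sketch.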
All the news about this version and a download link are available at the OSG home page.
Mike
April 11th, 2007 @ 18:27 by Mike
The sitemaps site had not been updated since November 2006, but it was updated today, as announced on the Official Google Webmaster Central Blog.
The site is now available in 18 languages, and the protocol has been updated to let webmasters specify the location of the sitemap in the robots.txt file!
Also, Ask.com now supports the sitemaps protocol.
There is nothing new at the moment in the development of our Open Sitemap Generator, but we look forward to more news in the near future (and maybe our inclusion in the Google sitemaps third party tools page).
Mike
March 28th, 2007 @ 11:46 by Mike
We’ve released a bugfix version, 0.5.1.
We’re looking forward to making a big upgrade to the Mapper core in the next major release, using db4o, with the aim of automatically generating multiple sitemaps with an index.
You can download this version from here.
Mike
February 1st, 2007 @ 23:03 by Mike
If you need to escape a string for use in an XML file (or stream), you have to escape these characters:
| Name | Character | Escape Code |
| Ampersand | & | &amp; |
| Single Quote | ' | &apos; |
| Double Quote | " | &quot; |
| Greater Than | > | &gt; |
| Less Than | < | &lt; |
To achieve this you could use the C# method SecurityElement.Escape(string str), but it has a problem.
If your string contains entities that are already escaped, it escapes them again, so &amp; becomes &amp;amp;.
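The double escaping can be reproduced with a short sketch. The query string below is made up for the example; `SecurityElement.Escape` lives in the `System.Security` namespace.

```csharp
using System;
using System.Security;

public static class DoubleEscapeDemo
{
    public static void Main()
    {
        // Hypothetical URL fragment containing a bare ampersand
        string raw = "page.aspx?a=1&b=2";

        // The first pass escapes the bare ampersand as expected
        string once = SecurityElement.Escape(raw);

        // A second pass escapes the ampersand of the existing entity again
        string twice = SecurityElement.Escape(once);

        Console.WriteLine(once);   // page.aspx?a=1&amp;b=2
        Console.WriteLine(twice);  // page.aspx?a=1&amp;amp;b=2
    }
}
```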
This happened to us while testing our sitemaps generator, when it found URLs on a page that were already escaped.
So we’ve developed this function, which checks every & character before escaping it.
public string EscapeXmlString(string URL)
{
    // Avoid errors if the string is already escaped for XML use
    for (int i = 0; i < URL.Length; i++)
    {
        if (URL[i] == '&')
        {
            // '\0' stands in for "no next character", so a trailing & is escaped too
            char next = (i + 1 < URL.Length) ? URL[i + 1] : '\0';
            switch (next)
            {
                case 'a':
                    if ((i + 5 < URL.Length) && (URL.Substring(i, 6) == "&apos;"))
                    {
                        continue;
                    }
                    else if ((i + 4 < URL.Length) && (URL.Substring(i, 5) == "&amp;"))
                    {
                        continue;
                    }
                    else
                    {
                        // Escape it
                        URL = URL.Insert(i + 1, "amp;");
                    }
                    break;
                case 'q':
                    if ((i + 5 < URL.Length) && (URL.Substring(i, 6) == "&quot;"))
                    {
                        continue;
                    }
                    else
                    {
                        // Escape it
                        URL = URL.Insert(i + 1, "amp;");
                    }
                    break;
                case 'g':
                    if ((i + 3 < URL.Length) && (URL.Substring(i, 4) == "&gt;"))
                    {
                        continue;
                    }
                    else
                    {
                        // Escape it
                        URL = URL.Insert(i + 1, "amp;");
                    }
                    break;
                case 'l':
                    if ((i + 3 < URL.Length) && (URL.Substring(i, 4) == "&lt;"))
                    {
                        continue;
                    }
                    else
                    {
                        // Escape it
                        URL = URL.Insert(i + 1, "amp;");
                    }
                    break;
                default:
                    // Escape it
                    URL = URL.Insert(i + 1, "amp;");
                    break;
            }
        }
    }
    // Escape the remaining special characters
    URL = URL.Replace("'", "&apos;");
    URL = URL.Replace("\"", "&quot;");
    URL = URL.Replace(">", "&gt;");
    URL = URL.Replace("<", "&lt;");
    return URL;
}
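The same check can also be written more compactly with a regular expression. The following is only a sketch of an alternative, not the code used in the generator; the class and method names are made up for the example. A negative lookahead matches every & that does not start one of the five entities from the table above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSketch
{
    // Matches an & that is NOT the start of &amp; &apos; &quot; &gt; or &lt;
    static readonly Regex BareAmpersand =
        new Regex("&(?!(?:amp|apos|quot|gt|lt);)");

    public static string EscapeOnce(string value)
    {
        // Escape only the bare ampersands, leaving existing entities alone
        string result = BareAmpersand.Replace(value, "&amp;");

        // Then escape the remaining special characters
        return result.Replace("'", "&apos;")
                     .Replace("\"", "&quot;")
                     .Replace(">", "&gt;")
                     .Replace("<", "&lt;");
    }

    public static void Main()
    {
        Console.WriteLine(EscapeOnce("index.php?a=1&b=2&amp;c=<3>"));
        // index.php?a=1&amp;b=2&amp;c=&lt;3&gt;
    }
}
```

Like the function above, this sketch only recognizes the five named entities. Numeric character references such as &#38; would still have their ampersand escaped again.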
Mike