Errors!

Author: Martin Nilsson <nilsson@roxen.com>
Last modified: 2000-09-07 22:48:32


Mistakes and errors are nothing new to people who create things. Whether you are a novelist or a programmer, you should take errors into account when making plans and schedules, and think about how to find and fix problems in what you have written. Being a programmer, I'll focus on how to solve RXML problems and leave the 1001 ways to use a dictionary and thesaurus to other sources.

Different errors

As with everything else in the world, you can categorize programming errors into several groups. The following are the ones that I have identified and will discuss in the sections below:

  • Dictionary errors
  • Syntax errors
  • Quoting errors
  • Logical errors
  • Runtime errors

Dictionary errors

The first errors that one will make in a new programming language are dictionary errors. Just as when you begin to use a new spoken language, you use words that do not exist in the language. For example, it is not unusual for people used to SSI to try to write



<echo variable="something">

In RXML, dictionary errors are often simple to find and correct. In the example above the tag will simply fall through unparsed and can be found in the HTML source when using the browser's view source function. The solution is of course to have a look in the manual.

Note that even tags that are in the manual can produce the same symptom if their module hasn't been added to the server. One way to find out which tags are available is to write <help/> on a page. This will bring up the built-in online tag documentation when you view the page.
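As a small sketch of the above, a scratch page containing nothing but the help tag shows what the server currently knows about. The for attribute in the second line is written from memory of the tag documentation and should be treated as an assumption; the output of <help/> itself will tell you whether it is available on your server.

```xml
<!-- List every tag the server currently provides -->
<help/>

<!-- Assumed form: show the documentation for a single tag, here "if" -->
<help for="if"/>
```

If a tag from the manual is missing from the list, the module that provides it probably hasn't been added.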

Wrong or misspelled attributes to tags are also considered dictionary errors. These may be a little trickier to find, since most tags do not let unknown attributes "fall through" and become visible in the HTML source.

Dictionary errors are sometimes regarded as syntax errors, and computers often report them as such, but you frequently have to deal with them in a completely different manner from syntax errors.

Syntax errors

Syntax errors have more to do with the form the code must follow, rather like the punctuation rules of natural languages. With RXML it is quite easy: follow XML and your code will be syntactically valid RXML. What does that mean? Basically we just add these rules to HTML:

  • The code is case sensitive. Write everything in lower case.
  • All tags must be closed. If you start a tag <x> you have to end it with </x>. An empty tag can be written in the atomic form <x/>. If you want to write your HTML code XML compliant as well, remember that not all browsers react well to e.g. <br/>, so use <br /> instead.
  • All attribute values have to be quoted, e.g. <imgs src="pic.jpg" alt='He says "Booo!"' />.
  • All attributes have to be nonempty, i.e. have a value, e.g. <hr noshade="noshade" />.

Nothing says, however, that the code will break just because you don't follow all these rules. RXML was made to be convenient, so you are, for instance, able to provide empty attribute values. It is really only the first two rules that get people into trouble. Breaking the first one causes a dictionary error, since the wrong name is used for things. If the second rule is broken, one can expect parts of the page to simply disappear from the result, since the parser doesn't find a matching end tag. The RXML parser in Roxen WebServer 2.1 tries to detect when an empty tag is used as a nonempty tag and reports an error.
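To make the four rules concrete, here is a minimal before-and-after sketch. The file name and alt text are placeholders, and how forgiving the parser is with the first group depends on the tag and the server version.

```xml
<!-- Relies on the parser (and the browser) being forgiving -->
<BR>                 <!-- upper case, never closed -->
<imgs src=pic.jpg>   <!-- unquoted value, never closed -->
<hr noshade>         <!-- attribute without a value, never closed -->

<!-- Follows all four rules -->
<br />
<imgs src="pic.jpg" alt="A picture" />
<hr noshade="noshade" />
```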

Quoting errors

Quoting errors are really a kind of syntax error, but I have given them a section of their own, since getting quoting right is often tricky in XML environments, especially when HTTP strings and SQL strings, which have different ways of quoting, are present.

By default as much as possible is quoted in RXML to avoid security problems. E.g.



<define variable="var.hi">
<b>Hi!</b>
</define>
&var.hi;

will yield &lt;b&gt;Hi!&lt;/b&gt; and not <b>Hi!</b> as one might have expected. A working solution in this case is to use &var.hi:none; instead. When you use an entity in another context, such as inside an SQL query, you are however currently unprotected and have to quote explicitly.


<sqlquery query="INSERT INTO user VALUES ('', '&form.name:mysql;')" />
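Side by side, the quoting choices look roughly like the sketch below. The none and mysql encodings are the ones used above; the url encoding and the form.q variable are assumptions added for illustration, so check the manual for the encodings your server version supports.

```xml
<!-- Default: the value is HTML-quoted when inserted into the page -->
<p>Hello, &form.name;!</p>

<!-- No quoting at all: only for values you fully control -->
&var.hi:none;

<!-- Quoted for a MySQL string literal inside the query attribute -->
<sqlquery query="INSERT INTO user VALUES ('', '&form.name:mysql;')" />

<!-- Assumed encoding: quoted for use inside a URL -->
<a href="/search?q=&form.q:url;">Search again</a>
```

The rule of thumb is to choose the encoding from where the value ends up, not from where it came from.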

Other tricks for getting the parsing your way are the <eval> tag and the <noparse> tag, which are both described in the manual. As a last trick up your sleeve you can use the noparse processing instruction container, available in Roxen WebServer 2.1. Nothing inside <?noparse ... ?> will be handled by the parser at all.



<?noparse
This is <roxen/> &roxen.version;
?>

Logical errors

This is possibly the kind of error that you can be stuck with the longest, if you discover it at all. A logical error appears where you made a mistake in the way you thought the program should work. Here is an example where someone tries to distinguish customers with fewer than a hundred sessions from customers with more than a hundred sessions.



<if variable="user.sessions < 100">
  <!-- Ordinary customer -->
  <redirect to="/customers/"/>
</if><if variable="user.sessions > 100">
  <!-- Very faithful customer -->
  <redirect to="/very_valued_customer/"/>
</if>

Now, this solution is bad from several points of view, but the logical mistake shows when a user goes from 99 sessions to 100 sessions. Neither condition matches exactly 100, so the user gets an empty page.
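One way to close the gap, sketched here with the same user.sessions variable and URLs as in the example, is to test a single boundary and let <else> catch everything else. That exactly 100 sessions should count as "very faithful" is an assumption; the point is only that every value now ends up in one of the branches.

```xml
<if variable="user.sessions < 100">
  <!-- Ordinary customer: fewer than 100 sessions -->
  <redirect to="/customers/"/>
</if>
<else>
  <!-- Very faithful customer: 100 sessions or more -->
  <redirect to="/very_valued_customer/"/>
</else>
```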

Debugging these kinds of errors is often tricky, especially since they have a tendency to occur seemingly inconsistently, in this case only once after a hundred sessions, and never again. The common thing to do when debugging this kind of error is to add debug code, that is, code that writes out the contents of critical variables and tells which branch of a conditional statement is executed. This is where you can tell who is an experienced programmer and who is not.

Sometimes it might be a bad idea to write the debug information into the actual HTML pages, perhaps because of redirects or because you are debugging a live site. Then the debug log, found in logs/debug/default.1, is a good place to output such information. From Pike you can use the report_debug or werror functions, which work like sprintf and write their output to the debug log. Note that report_debug is only defined in Roxen WebServer, whereas werror is always available in Pike.



werror("This is the first line of the program\n");
mapping a=([ "a":23, "b":45 ]);
werror("a is %O\n", a);

From RXML pages you can use the debug tag.


<debug werror="This is printed in the debug log"/>
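Applied to the customer example from the beginning of this section, debug code could look like the sketch below. It deliberately leaves the faulty conditions in place, since the point is to find the bug, and it assumes that entities inside the werror attribute are expanded like in any other attribute. The message texts are made up.

```xml
<!-- Log the value that the decision is based on -->
<debug werror="user.sessions is &user.sessions;"/>

<if variable="user.sessions < 100">
  <debug werror="branch: ordinary customer"/>
  <redirect to="/customers/"/>
</if><if variable="user.sessions > 100">
  <debug werror="branch: very faithful customer"/>
  <redirect to="/very_valued_customer/"/>
</if>
```

For a user with exactly 100 sessions the debug log gets the first message but no branch message, which points straight at the gap between the two conditions.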

Runtime errors

This is the kind of error that the web surfer most often sees before you do, and then posts funny remarks about on webboards and IRC, concerning the quality of your server and your ability to manage complex systems. Examples of runtime errors are when subsystems that the page depends on, such as databases and CGI scripts, are down or broken, which yields funny or unexpected results such as backtraces. It can also be, for example, an SQL query that generates a syntax error in the SQL database because of a quoting error in the generated RXML code, triggered by a user's strange or intentionally evil input.

Quick summary: Runtime errors are errors that appear at runtime due to failing systems, inadequate program security/integrity (quote errors or other lack of variable value verification) or intentional cracking attempts. They are not really part of the rest of my error classification, but rather an overlapping definition. One of the problems I had when writing RXML tags was to decide when to give a parse (syntax+dictionary) error and when to give a runtime error.

For the reasons stated above, it is often desirable to have some sort of error handling that is polite and uninformative, as opposed to backtraces and RXML errors, which are "cold" and detailed. It is even possible that the backtrace gives away information that you don't want others to see. (Powertip to the Pike module developer: if you alter the values in the attribute mapping, the altered values are what will be printed in the backtrace. Hence if you replace all the passwords and other sensitive information in the attribute mapping with strings like "XXXXXX", that string is what will be shown in the backtrace. This operation is preferably done at the start of your function, after you have read the values into local variables...) This is why, if you try to do an emit with an SQL source and an invalid query, you will not get an error written out on the page (although you will find one in the debug log): the default behavior of RXML is to try to cover up runtime errors. You can disable this in the settings of the RXML Parser module.

Now, showing an empty page when an SQL query has failed is obviously not a good thing, whether you are developing the page or it is a live runtime error. The default behavior of the RXML parser is to alter the truth value on the page when an error has occurred. Hence you can use constructions with else, elseif and then to create the desired actions.



<emit source="sql" query="SELECT * FROM products WHERE ">
<b>&_.name;</b> <p>&_.desc;</p>
</emit>
<else>
<h3>An SQL error has occurred. Bummer.</h3>
</else>

The other possibility is of course that you do want to see the error, and possibly even more debug information. You can control this with the debug tag. <debug/> will set id->misc->debug to 1, which is interpreted by several tags.

I Demand Flexibility!

The behavior of RXML errors can be controlled from the RXML Parser module's settings page, under the "RXML Errors" tab. There you can select which errors should show up in the debug log and which should show up on the HTML pages. The default setting is that parse errors (dictionary and syntax errors) show up on screen but not in the log, and that runtime errors show up in the log but not on screen. The equivalent of <false> is always inserted when an error occurs.

Tags that generate graphic text, such as gbutton and gtext, tend to generate big images when an error occurs in their input and they get a backtrace. To prevent this, these tags have the flag DONT_REPORT_ERRORS set, which causes the "parent" tag to report the error instead.

In the proud tradition of turning everything into a plugin system, it is also almost possible to create your own error handling system. The RXML Parser, for which the RXML Parser module is only a wrapper, looks for a module that provides "RXMLRunError" and "RXMLParseError" and calls the functions rxml_run_error and rxml_parse_error when any of these events occur. Take a look at these short functions in the file server/modules/tags/rxmlparse.pike to see how things work.