tag:blogger.com,1999:blog-88120125448370636272024-03-14T03:54:50.277+01:00stufffun stuff yay!Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.comBlogger12125tag:blogger.com,1999:blog-8812012544837063627.post-46283758190819392462008-03-03T12:38:00.000+01:002011-03-25T08:50:17.739+01:00XCS Syntax<div>Extended Cascading Stylesheets (XCS) is a facilitating extension to the standard CSS language. Since it is handled by a preprocessor <i>before</i> it is handed out to a CSS-fluent client (in a way similar to, say, how C sources are handled), it is useful to think of it as a macro language, rather than as an actual extension of the standard. As noted in <a title="the list" href="http://malatestapunk-stuff.blogspot.com/2008/02/xcs-extended-cascading-stylesheets.html#3.1%20The%20List" id="c8sm">The List</a>, new keywords and semantics are introduced, and I tried hard to keep the syntax as close to the spirit of CSS as possible.</div> <h4>A new <code>@</code> rule - <code>@require</code> keyword<a id="wlmg" name="at-require"></a></h4> <p>Much as in PHP, the new keyword includes the referenced file <i>inline</i>, during the preprocessing. A parenthesized string representing a valid path is required either immediately after the keyword, or preceded by any number of whitespace characters. Example:</p> <pre><code>@require(../etc/defines.css);</code></pre> <p>The parenthesized string is examined as a PHP path string - so, slashes will work in a Windows environment as well.</p> <h4>Single-line comments<a id="o5-o" name="single-line-comments"></a></h4> <p>When the string <code>//</code> is encountered anywhere within a stylesheet, the rest of the line, up to the ending newline character, will be considered an XCS comment, and will be converted to a standard CSS <code>/* ... */</code> comment in place.</p> <h4>Constants<a id="weg2" name="constants"></a></h4> <p>XCS understands the concept of constants rather than variables, declared through assignment. 
Once declared, the values can be cascaded, overshadowing each other. However, the actual value is expanded in the stylesheet only after the cascading is done. Variable names must be preceded by a variable-indicator string or character (<code>!</code> by default, but that can be changed) in order to be identified by the preprocessor. The value is always in global scope. Assignment is done by tying a variable name to a value:</p> <pre><code>!green = #0000FF;
!heavy = 4em;
!light = 0.5em;
!green = #FF0000;</code></pre> <p>It is important to note that all occurrences of <code>!green</code> in the rules within a stylesheet will be replaced with the string <code>#FF0000</code> (red), because of value cascading. The assignment must be done within a single line, and the ending semicolon must not be omitted.</p> <h4>Simple math<a id="x2p_" name="simple-math"></a></h4> <p>Simple math calculations involving either two declared variables or a variable and a CSS value (or a numeric constant) can be conducted either in variable assignment, or in a rule itself. These simple expressions automatically inherit the CSS unit from the second operand (or from the first, in case the second operand is a numeric constant). Examples:</p> <pre><code>!red = #ff0000;
!green = #0000FF;
!border = !heavy-!light solid !green;
!heavy = 4em;
!light = 0.5em;
body{
color: !red-!green; // Colors can be calculated as well
border: !border;
bla: !heavy/2;
}</code></pre> <h4><code>expr</code> Expressions<a id="jigi" name="expr-expressions"></a></h4> <p>A more complex form of value expansion is the <code>expr</code> keyword, which expands to the value of the parenthesized expression that follows it. The expression can contain any number of XCS constants, and even PHP variables. <b>Note:</b> since that is a security disaster just begging to happen in an uncontrolled environment, the expression evaluation can be entirely disabled. The <code>expr</code> keyword must always be followed by a parenthesized expression, either immediately, or after any number of whitespace characters. Examples:</p> <pre><code>!const = 12;
!size = expr((!const * 2)/24)em; // expands to '1'
!date = expr('10 Dec 2007');
!server = expr($_SERVER['SCRIPT_FILENAME']);
!newSize = expr(!const*22)px; // Can NOT use `expr` values in expr</code></pre> <p>It is important to note that the values assigned by expression expanding can <i>not</i> be used within expressions recursively, as seen in the last line of the example. The expanded values can be either strings or numerics, whatever, the preprocessor doesn't care, as long as they can be expanded to a finite value.</p> <h4>Rule inheritance<a id="umr_" name="rule-inheritance"></a></h4> <p>Just like CSS understands <i>element</i> inheritance, XCS understands <i>rule</i> inheritance as well. Rule inheritance can be triggered by using an <code>extends</code> keyword in the selector part of the rule. The <code>extends</code> must be followed by a parenthesized existing rule selector string, either immediately, or after any number of whitespace characters. The rule indicated by the parenthesized selector string is then prepended to the current rule, thus allowing any cascading effect to take place. Example:</p> <pre><code>p.emphasis {
font-style: italic;
}
p.strong-emphasis extends (p.emphasis) {
font-weight: bold;
}</code></pre> <p>All HTML <code>p</code> elements with class <code>strong-emphasis</code> will be rendered in both bold and italic font.</p> <h4>Defined preprocessor behavior in erroneous situations<a id="nu5g" name="defined-behvior"></a></h4> <p>The preprocessor acts as suggested by the CSS standard specification for compliant engines: as long as a stylesheet is <i>syntactically</i> correct, the preprocessor will attempt to understand its <i>semantics</i>, leaving as-is anything it can't understand, or can't understand fully. So, as long as everything <i>looks</i> OK to the parser, the preprocessor won't barf at you - however, there might be, as in CSS, semantic errors that have sneaked in.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com2tag:blogger.com,1999:blog-8812012544837063627.post-51589904750443429852008-02-29T13:19:00.001+01:002011-03-25T08:50:25.592+01:00Extended Cascading Stylesheets - the beginning<div><span class="Apple-style-span" style="font-weight: bold; ">1 What?</span></div><h4><a class="ItemAnchor" name="1 What?" title="1 What?"></a></h4><a href="http://www.phpclasses.org/browse/package/4351.html" target="" title="XCS on phpclasses.org">Extended Cascading Stylesheets</a> (from now on, <a href="http://www.phpclasses.org/browse/package/4351.html" target="_blank" title="XCS on phpclasses.org">XCS</a>) is an attempt at adding some development-facilitating features to the standard Cascading Stylesheets language (CSS) while avoiding unnecessary language pollution by introducing an intermediate parser, thus keeping the output standards-compliant.
<h5> 1.1 <em>What?!?</em><a class="ItemAnchor" name="1.1 What?!?" title="1.1 What?!?"></a></h5> In simpler terms, I tried to make my own time spent with CSS more pleasant, by relying on an intermediate layer (parser) instead of on the quirks of the standard.
<h4> 2 Why?<a class="ItemAnchor" name="2 Why?" title="2 Why?"></a></h4> While working with CSS, I found a lot of annoying stuff - and I'm not talking about browser quirks here, but the language itself. For instance, it has always been annoying for me that there is no single-line comment in CSS (an equivalent to, say, <span style="font-family: monospace;"><code> // </code> or <code> # </code></span> in some other languages) - thus, no quick'n'easy way to kick an entire line out of your current sheet. I realize this is quite individual and really <span style="font-weight: bold;">not</span> all that important - however, I'm just marking an example of a trivial feature that made my life suck a bit more.
<h5> 2.1 Origins<a class="ItemAnchor" name="2.1 Origins" title="2.1 Origins"></a></h5><p> While working on some changes for a friend's site, I noticed I was, essentially, doing the same thing all over again - the layout type of the site was the same (3 columns - 2 fixed width, one liquid), typography was the same, only the color scheme and the fixed column width changed. Most of the CSS changes I did involved similar actions - the base was sound, only minor changes had to be made. These changes, however, included lots of line hunting, either manually or facilitated by <code> sed </code> or a similar tool in the current editing environment. </p><h6>2.1.1 Zeitgeist<a class="ItemAnchor" name="2.1.1 Zeitgeist" title="2.1.1 Zeitgeist"></a></h6><p> At about the same time, <a href="http://www.yaml.de/en/" target="_blank" title="YAML">CSS</a> <a href="http://code.google.com/p/blueprintcss/">frameworks</a> <a href="http://www.google.com/search?q=css%20framework" target="_blank" title="">became all the rage</a>, with <a href="http://www.b-list.org/weblog/2007/nov/19/frameworks/" target="_blank" title="">all their</a> <a href="http://www.the-haystack.com/2007/08/11/semantic-markup-and-css-frameworks/">pros and cons</a> <a href="http://jeffcroft.com/blog/2007/nov/17/whats-not-love-about-css-frameworks/" target="_blank" title="">baggage</a>. <a href="http://builder.yaml.de/" target="_blank" title="">Online tools</a> for making layouts based on this or that framework came to life. New <a href="http://www.blogger.com/">Blogger</a> templates (<a href="http://help.blogger.com/bin/answer.py?answer=43708">layouts</a>) with CSS constants support had already been introduced, allowing inexperienced users to easily change some aspects of a chosen template. It all seemed to be interconnected somehow, in an effort to make the users' and developers' lives more pleasant. 
So, when reading <a href="http://nubyonrails.com/articles/dynamic-css" target="_blank" title="Dynamic CSS">this article</a>...</p><p></p><h6>2.1.2 It all clicked together<a class="ItemAnchor" name="2.1.2 It all clicked together" title="2.1.2 It all clicked together"></a></h6><p>It really did. It offered the idea of a DIY tool that could relieve me of all the trivial (and not-so-trivial) aches I faced with CSS - one that would allow me to express myself more easily, while using the familiar CSS syntax, sugar-coated for easier swallowing on my side, plain old for browsers to consume.</p><h4>3 How?<a class="ItemAnchor" name="3 How?" title="3 How?"></a></h4>The first thing I decided to leave out of the original concept is the implementation language. A PHP class was the way to go for me, because a) I don't necessarily always have a Ruby interpreter around and, more importantly, b) a PHP class would lend itself well to making a plugin for embedding into existing CMS solutions.
The concept of CSS constants is a definite keeper - this alone would save a whole lot of time normally spent on search-and-replace. Some other concepts were excellent, but some didn't really seem all that important, and some features I considered handy were not there at all. So I sat down and made myself...
<h5>3.1 The List<a class="ItemAnchor" name="3.1 The List" title="3.1 The List"></a></h5>The List of the stuff I wanted to have:
<ul><li>CSS variables</li><li>Extended Math expression support for both colors and measurements - perhaps even strings</li><li><code>require</code>-a-like keyword, for inline, compile-time inclusions</li><li>Single-line comments</li><li>CSS rule inheritance by extending (as opposed to regular CSS <span style="font-style: italic;">element</span> inheritance by cascading)</li><li>Easily customizable syntax for the newly introduced features</li><li>Various levels of pretty-printing of the resulting CSS, coupled with some basic compression</li></ul><h5>3.2 Implementation plan<a class="ItemAnchor" name="3.2 Implementation plan" title="3.2 Implementation plan"></a></h5>Once I had settled on the features I'd like to have, I considered my options for a concrete implementation. I was told that <a href="http://www.php.net/index.php#2007-07-13-1">PHP4 was dead</a> and that the sooner we all start doing stuff with PHP5, the sooner everybody else will follow. So PHP5 it is, then. Of course, this severely cuts down on the number of content management systems one could embed this in <span style="font-style: italic;">right now -</span> however, this will change very soon, as hosting services start adopting PHP5 because PHP4 is no longer supported. And PHP5 is more fun, anyway.
<h5>3.3 The result<a class="ItemAnchor" name="3.3 The result" title="3.3 The result"></a></h5><a href="http://www.phpclasses.org/browse/package/4351.html">Take a look at the result</a> at <a href="http://www.phpclasses.org/" target="_blank" title="">phpclasses.org</a>.Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com0tag:blogger.com,1999:blog-8812012544837063627.post-73299746054795538202008-02-27T11:47:00.001+01:002008-02-27T12:00:31.992+01:00Working with GtkScintilla<p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_4Vwug7lU7r0/R8VA6geDnTI/AAAAAAAAABI/qbWvflnzEr8/s1600-h/window.jpg"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;" src="http://bp0.blogger.com/_4Vwug7lU7r0/R8VA6geDnTI/AAAAAAAAABI/qbWvflnzEr8/s320/window.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5171611120997473586" /></a>
First thing to note, I was working with <abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK2 (php-gtk-2.0.0 beta) and, naturally, was following the reference on that (<a href='http://gtk.php.net/manual/en/reference.php'><abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK2 reference</a>). However, the <a href='http://gtk.php.net/manual/en/gtk.gtkscintilla.php'>GtkScintilla</a> section in the <abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK2 reference really is <em>not</em> all that informative. Since the <code>GtkScintilla</code> <abbr title='Application Programming Interface'>API</abbr> has undergone very few changes since the last version (as far as I can tell), you may be better off using <a href='http://gtk.php.net/manual1/en/scn.gtkscintilla.php'>the older reference for GtkScintilla</a>.</p>
<p>Another thing to note: some stuff is missing from the older reference as well - most notably, descriptions (and even names) of nearly all defines are missing, as are those of some of the methods - the ones related to the search functionality provided by the <code>GtkScintilla</code> class, for example. For instance, this is what the reference for the <code>set_search_flags</code> method looks like:</p>
<p><strong><abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK:</strong> <code>void set_search_flags(int flags);</code><br/>
<strong><abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK2:</strong> <code>void set_search_flags(flags);</code></p>
<p>OK. Even a somewhat more informative reference didn't take me very far. So, the flags are <code>int</code>. Great. Now what? And how do I do the search, anyways?</p>
<h4>Use the source</h4>
<p>Well, as it turned out, lacking documentation for the constants wasn't such a problem, because none of the search-related constants are defined anyway. So, what's a man to do, except to dive into the source code?</p>
<p>After some grepping through the <code>GtkScintilla</code> sources, I've found these values in <code>ext\scintilla\libscintilla\include\Scintilla.h</code>:</p>
<pre><code>#define SCFIND_WHOLEWORD 2
#define SCFIND_MATCHCASE 4
#define SCFIND_WORDSTART 0x00100000
#define SCFIND_REGEXP 0x00200000
</code></pre>
<p>And so that's what I used for my search flags:</p>
<pre><code>@define ("SCINTILLA_FIND_DOWN", 1);
@define ("SCINTILLA_FIND_WHOLE_WORDS", 2);
@define ("SCINTILLA_FIND_MATCH_CASE", 4);
@define ("SCINTILLA_FIND_WORD_START", 0x00100000);
@define ("SCINTILLA_FIND_REGEXP", 0x00200000);
</code></pre>
<p>The <code>@</code>s in front of each <code>define</code> are a future-safe, short-circuit error guard, in case a particular define already exists. I <em>could</em> surround each statement with <code>if</code>s, yeah, but this seems like a much nicer way of expressing the same thing - if a define exists, <code>define</code> fails with an error message and the existing value is left alone. The message gets suppressed thanks to <code>@</code>, and we're on our merry way. Well, that covered the search-related defines, but I still didn't know how to actually conduct the search.</p>
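<p>For comparison, the same guard can be spelled out with <code>defined()</code> instead of relying on error suppression. This is only a minimal sketch: the helper name <code>define_once</code> is made up for illustration and is not part of PHP or PHP-GTK.</p>
<pre><code>// Hypothetical helper, not part of PHP or PHP-GTK:
// define a constant only if nothing has defined it yet.
function define_once($name, $value) {
    if (!defined($name)) {
        define($name, $value);
    }
    // Report whichever value is actually in effect.
    return constant($name);
}

define_once("SCINTILLA_FIND_WHOLE_WORDS", 2);
define_once("SCINTILLA_FIND_MATCH_CASE", 4);
// A second attempt is silently skipped; the first value stays in place.
$inEffect = define_once("SCINTILLA_FIND_MATCH_CASE", 99);
</code></pre>
<p>It is more typing than a leading <code>@</code>, but it does not hide unrelated errors that might occur on the same line.</p>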
<h4>Putting it to (good) use</h4>
<p>After some trial and error, I've managed to poke my way through. So, in a nutshell:</p>
<pre><code>// $current is a GtkScintilla instance
if (!$firstTime) {
$pos = $current->get_selection_end();
$current->set_selection_start($pos);
$current->goto_pos($pos);
}
$current->search_anchor();
// Search forward
$result = $current->search_next($searchFlags, $searchTerm);
// Search backwards
// $result = $current->search_prev($searchFlags, $searchTerm);
if ($result > 0) echo("Match found at $result");
</code></pre>
<p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp0.blogger.com/_4Vwug7lU7r0/R8VBGgeDnUI/AAAAAAAAABQ/DRoA4dFaWrg/s1600-h/far.jpg"><img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://bp0.blogger.com/_4Vwug7lU7r0/R8VBGgeDnUI/AAAAAAAAABQ/DRoA4dFaWrg/s320/far.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5171611327155903810" /></a>
The <code>if</code> clause determines if we had a previous match and if so, resets the selection and sets the current cursor position at the end of it. Then we anchor the start of the search to the current cursor position by calling <code>GtkScintilla</code>'s <code>search_anchor</code> method. Then we can actually perform the search, by calling either the <code>search_next</code> (search forward) or the <code>search_prev</code> (search backwards) method and passing the appropriate arguments (an integer bitmask <code>$searchFlags</code> and a string to look for, <code>$searchTerm</code>).</p>
<p>Of course, you should have something like this <em>above</em> the code I just introduced:</p>
<pre><code>$firstTime = true; // or false
$searchTerm = "blah"; // whatever your search term is
$searchFlags = SCINTILLA_FIND_MATCH_CASE | SCINTILLA_FIND_WHOLE_WORDS; // bitwise OR combines flags
</code></pre>
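<p>Since <code>$searchFlags</code> is a bitmask, individual flags are switched on with bitwise OR and tested with bitwise AND. A small sketch of the arithmetic, using only the values found in <code>Scintilla.h</code> above (the variable names are made up for this example):</p>
<pre><code>// Flag values as found in Scintilla.h; the variable names are made up for this sketch.
$wholeWord = 2;
$matchCase = 4;
$wordStart = 0x00100000;

// Bitwise OR switches several flags on at once: 2 | 4 gives 6.
$flags = $wholeWord | $matchCase;

// Bitwise AND tests whether a particular flag is set.
$hasMatchCase = ($flags & $matchCase) !== 0; // true
$hasWordStart = ($flags & $wordStart) !== 0; // false

// AND-ing two different single-bit flags always yields 0, i.e. no flags at all.
$noFlags = $wholeWord & $matchCase;
</code></pre>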
<p>If successful, both methods set the selection around the search term for you and return the position in the text where the match occurred. If the return value is 0, the match wasn't found.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com0tag:blogger.com,1999:blog-8812012544837063627.post-76415629713242086452008-01-21T08:35:00.001+01:002008-02-27T11:55:53.665+01:00PHP-GTK experience<p>Recently, I've been extensively involved in desktop application development using <a href='http://gtk.php.net/'><abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK</a>.</p> <img style="float:right" src="http://upload.wikimedia.org/wikipedia/en/8/85/Php-gtk.png" alt="PHP-GTK logo" />
<p>At first, I just needed a cross-platform, <a href='http://daringfireball.net/projects/markdown/'>Markdown</a>/<a href='http://texy.info/en/'>Texy</a>-capable desktop blogging tool that would publish to (and open/batch backup from) <a href='http://www.blogger.com'>Blogger</a>, preferably <a href='http://scintilla.sourceforge.net/'>Scintilla</a>-based. I looked around quite a bit and didn't like what I found, so I decided to do myself a favor and just sit down and make it already. Then I weighed some choices on how to actually do it:</p>
<h4>First Choice - <em>HOW DO I DO IT?</em></h4>
<dl>
<dt>Quickly, hack together a bunch of batch files</dt>
<dd>Blah. Obvious. Ugly. Not cross platform.</dd>
<dt>Make some pretty witty shell script(s) instead</dt>
<dd>Blah. Uninspired. Not cross platform.</dd>
<dt>Do it properly</dt>
<dd>Hmm ... OK, let's try that. Sigh, more choices.</dd>
</dl>
<h4>Second Choice - <em>AGAIN, HOW DO I DO IT?</em></h4>
<dl>
<dt>C/C++</dt>
<dd>Nah.</dd>
<dt>C#, <a href='http://www.mono-project.com/Main_Page'>Mono</a></dt>
<dd>Nyeee ... I'm not very familiar with either the language itself or Mono.</dd>
<dt>Java</dt>
<dd>Nyeee ... an option. Or is it, really? An option?</dd>
<dt>Python <a href='http://www.pygtk.org/'>GTK</a>/<a href='http://wiki.python.org/moin/PyQt'>QT</a></dt>
<dd>Hmmm ... definitely an option.</dd>
<dt><abbr title='PHP Hypertext Preprocessor'>PHP</abbr> <a href='http://gtk.php.net/'>GTK</a>/<a href='http://php-qt.org/'>QT</a></dt>
<dd>Well, let's take a look at that first.</dd>
</dl>
<p>Once settled upon that, the choice between QT or GTK bindings wasn't really much of a choice, actually. QT bindings for <abbr title='PHP Hypertext Preprocessor'>PHP</abbr> are still quite young in development, plus the documentation is still virtually non-existent. Furthermore, as attached as I used to be to QT appearance, I recently switched to GNOME (at least until KDE 4.1) and learned to actually like it, so a GTK interface would be a natural choice for me right now.</p>
<h4>Enough. Let's get on with it already</h4>
<p>Now, once again, I had but a rhetorical choice between <abbr title='PHP Hypertext Preprocessor'>PHP</abbr>-GTK versions 1 and 2. In its second version it provides a good <abbr title='Object-Oriented'>OO</abbr> <abbr title='Application Programming Interface'>API</abbr>, some new components and a quite stable foundation - although still in beta - so I obviously went with V2. The new components (at least <code>GtkHtml</code> and <code>MozEmbed</code>, as I wanted to have <abbr title='Hyper Text Markup Language'>HTML</abbr> preview) turned out to be either non-existent, not (yet) fully cross platform, or quite buggy, so I left them out for now. No <abbr title='Hyper Text Markup Language'>HTML</abbr> preview. Pah.</p>
<p>What I did get, free of any charge, is the excellent <code>GtkScintilla</code> component - however, it turned out to be poorly documented and still somewhat rough around the edges. Oh, well. I used some of <a href='http://perl-win32-gui.sourceforge.net/cgi-bin/docs.cgi?doc=scintilla'>this</a>, some of <a href='http://scintilla.sourceforge.net/ScintillaDoc.html'>that</a>, some looking at the <code>GtkScintilla</code> source coupled with some blind luck, and it turned out OK - which I hope to <a href="http://malatestapunk-stuff.blogspot.com/2008/02/working-with-gtkscintilla.html">explain more thoroughly in a future post</a>.</p>
<p>Anyway, the ease of use really amazed me - without Glade or any other <abbr title='Graphical User Interface'>GUI</abbr>-editing <abbr title='Integrated Development Environment'>IDE</abbr>, it took just about a day or two to get a fully functional editor/blogging tool. The <abbr title='Object-Oriented'>OO</abbr> <abbr title='Application Programming Interface'>API</abbr> allowed for easy custom control creation, which greatly improved the speed of development. So much so, that I decided to use it for a pending project for a client.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com0tag:blogger.com,1999:blog-8812012544837063627.post-70431573022812899112007-11-01T07:57:00.001+01:002007-11-01T07:57:54.594+01:0010 Dos and Don'ts When Using Microformats Parser<p><code>MicroformatParser</code> has actually been used in the real world (i.e. out of my sandbox testing grounds) for some time now, and I've been getting valuable feedback from developers. During that time, some of the most common problems - <em>and</em> some of the best practices to circumvent them - have emerged, and I thought it would be nice to collect them all in one place to share with others.</p>
<h4>Dos</h4>
<p>Please, <strong>do</strong>:</p>
<h5>... use Tidy</h5>
<p>The web is filthy, and you <em>do</em> need something to keep you clean. You can't just assume that you're working with well-formed XML from an external source - 9 out of 10 times the XML parser will choke and your script will croak because of that assumption.</p>
<p>What you can do is try and decrappify the input using <a href='http://www.php.net/manual/en/ref.tidy.php'>Tidy</a>. For a kick-start on using Tidy with PHP, you may want to check out <a href='http://malatestapunk-stuff.blogspot.com/2007/02/use-web-use-tidy.html'>this post</a> as well.</p>
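<p>A minimal sketch of that clean-up step, assuming the Tidy extension is loaded. The wrapper name <code>clean_markup</code> and the choice of options are assumptions made for this example, not something <code>MicroformatParser</code> prescribes, and the wrapper simply hands the input back untouched when Tidy is missing:</p>
<pre><code>// Hypothetical wrapper: repair markup with Tidy before handing it to a parser.
function clean_markup($html) {
    if (!function_exists('tidy_repair_string')) {
        // Tidy extension not loaded: return the input unchanged.
        return $html;
    }
    $config = array(
        'output-xhtml'     => true, // emit well-formed XHTML
        'numeric-entities' => true, // avoid named entities an XML parser may not know
    );
    $clean = tidy_repair_string($html, $config, 'utf8');
    return ($clean === false) ? $html : $clean;
}

$xhtml = clean_markup('<p>Unclosed paragraph <b>and bold');
</code></pre>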
<h5>... check your PHP version</h5>
<p>For PHP4, everything should just work right out of the box. However, for PHP5 you'll need <a href='http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/domxml-php4-to-php5.php.txt'>this script</a>, by <a href='http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/index.en.html'>Alexandre Alapetite</a>. He's done a great job of wrapping DOM XML extension API, making it available to PHP5 users.</p>
<h5>... check <code>xArray</code> documentation</h5>
<p>It may be tempting to just call <code>toArray()</code> method on the result and work with a familiar datatype. However, <code>xArray</code> is specifically crafted to facilitate working with collections of objects, such as your parsing results. The documentation is included in the package, and you can re-run <a href='http://www.phpdoc.org'>PhpDocumentor</a> over the source file to get it in a format you prefer. For more info on <code>xArray</code> you can also check out <a href='http://malatestapunk.wiki.zoho.com/xArray.html'>the documentation wiki</a>. It is a work in progress, but some valuable info is already there.</p>
<p>Also, there is a new <code>xArray</code> version on the way (v0.2), which will make handling complex trees of data even easier.</p>
<h5>... check if <code>(bool)FALSE</code> is returned</h5>
<p>On error, <code>MicroformatParser</code> returns <code>(bool)FALSE</code> <em>instead</em> of an <code>xArray</code> object. So make sure that everything went OK <strong>before</strong> you try to do anything further with the result:</p>
<pre><code>if($microformatsResult) ...
</code></pre>
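<p>A stricter variant of the same check compares against <code>false</code> with <code>!==</code>, so it does not depend on how PHP casts the result to a boolean. The sketch below is only an illustration: <code>parse_succeeded</code> is a made-up helper, and a plain <code>stdClass</code> object stands in for the <code>xArray</code> result.</p>
<pre><code>// Hypothetical helper: tell a failed parse apart from any other result.
function parse_succeeded($result) {
    // Only the boolean false signals an error; everything else counts as a result.
    return $result !== false;
}

$failed = parse_succeeded(false);          // false
$parsed = parse_succeeded(new stdClass()); // true, stdClass stands in for an xArray
</code></pre>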
<h5>... use caching</h5>
<p>Actual fetching of the remote page will most likely be the slowest part of your script (if it's not, something is seriously wrong). So, to shorten the execution time, implement some sort of caching mechanism in order to keep remote page fetching to a minimum.</p>
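<p>One possible shape for such a cache is a file per URL with a time-to-live. The following is a sketch under stated assumptions rather than anything shipped with <code>MicroformatParser</code>: <code>cached_fetch</code>, its parameters and the cache file naming are all made up here, and the closure syntax assumes PHP 5.3 or later.</p>
<pre><code>// Hypothetical file cache; $fetch is any callable that returns the page body or false.
function cached_fetch($url, $ttl, $cacheDir, $fetch) {
    $file = $cacheDir . '/' . md5($url) . '.cache';
    // Serve the stored copy while it is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }
    $body = call_user_func($fetch, $url);
    if ($body !== false) {
        file_put_contents($file, $body);
    }
    return $body;
}

// Usage with a stand-in fetcher that counts how often it is really called.
$calls = 0;
$fetch = function ($url) use (&$calls) {
    $calls++;
    return "<html>page at $url</html>";
};
$dir = sys_get_temp_dir() . '/mf_cache_' . uniqid();
mkdir($dir);
$first  = cached_fetch('http://example.com/', 300, $dir, $fetch);
$second = cached_fetch('http://example.com/', 300, $dir, $fetch); // read from the cache
</code></pre>
<p>In a real script the stand-in fetcher would be replaced by whatever actually retrieves the remote page.</p>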
<h5>... contact me</h5>
<p>This isn't really a "best practice" thing, but I think it's still worth keeping in mind. If you find a bug or just keep hitting the wall, don't hesitate to contact me. I'll try to help as much as I can.</p>
<h4>Don'ts</h4>
<p>There aren't as many of those, but they're just as important. So, please <strong>don't</strong>:</p>
<h5>... assume you're parsing well-formed XHTML</h5>
<p>Because it's just not true, most of the time.</p>
<h5>... use PHP5 DOM XML extension</h5>
<p>As of PHP 5.0, the required DOM XML extension is not bundled with PHP anymore. There is one available from <a href='http://pecl.php.net/'>PECL</a>, <strong>but you don't want to use that</strong>. Thanks to <a href='http://deneme-blog.blogspot.com/'>deneme</a>'s patience and valuable input we discovered that you can't really plug it in and expect everything to work. You should keep away from it and use <a href='http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/index.en.html'>Alexandre Alapetite</a>'s solution instead.</p>
<h5>... use it for something malicious</h5>
<p>I can't really tell you what to do with it, but <strong>please don't use it for something bad</strong>, like email scraping. Would <em>you</em> like your name and email listed in some new directory handed down to generations of spammers? No, I bet you wouldn't. So don't do it to others, either.</p>
<h5>... output invalid XML (XHTML included)</h5>
<p>This is not strictly related to <code>MicroformatParser</code> usage, but it's good advice nevertheless. Please, don't do that. The rest of the web will thank you for your effort.</p>
Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com0tag:blogger.com,1999:blog-8812012544837063627.post-36159776993398320042007-08-13T09:27:00.000+02:002007-08-14T07:12:23.067+02:00Spaghetti dinner: the beginning<p>After the <a href="http://malatestapunk-stuff.blogspot.com/2007/05/php-spaghetti.html">initial idea on possibilities of <abbr title="PHP Hypertext Preprocessor">PHP</abbr> metaprogramming with templates</a>, I pondered quite a bit on the subject. The thing is, I'm not too keen on template languages for a number of reasons - most of them being aptly disclosed in the <a href="http://malatestapunk-stuff.blogspot.com/2007/05/php-spaghetti.html">comment thread</a> by <a href="http://piepalace.ca/blog">this guy (e)</a>:</p>
<blockquote>
<p>Template languages are nice, but they are never as expressive as the language that are implemented in. Perhaps more importantly, they're harder to learn, and have much less documentation, meaning that any poor sap that has to maintain your templated solution is in for a world of hurt.</p>
</blockquote>
<p>Generally, I agree with every single word. However, I wondered if being much less expressive is exactly the reason why the template languages <strong>can</strong> be useful in a <em>particular context</em> - more precisely, could a small, simple project (a couple of pages in size) benefit from such a solution?</p>
<p>Using a full-blown framework is overkill for such a project. Apart from inevitable execution speed compromises, you don't gain much in speed and ease of development either - you still have to write all your controllers, models and views, which is 3 times the number of spaghetti scripts you'd otherwise write. On the flip side, this way you at least have a maintainable application - revisiting spaghetti code can be a <em>real</em> pain.</p>
<p>That's where code generation from a template might come into play. It could provide at least some much needed project structuring, alleviate the common tasks (e.g. setting up the database, fetching and looping through results, stuff like that) and shift the development focus towards the UI<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. Also, you never, ever touch the generated spaghetti code.</p>
<p>Anyway, that was the theoretical reasoning I decided to investigate. After some trial and error, I made a parser for an <abbr title="eXtensible Markup Language">XML</abbr>-based language, slightly resembling the new Blogger template language. Specifically, there are <code>loop</code> tags for iterating through arrays, function/method results or database results (which one is generated depends on the attributes provided), <code>data</code> tags for echoing variables and/or function/method results, <code>fetch</code> tags for fetching single database results, <code>set</code> tags for setting variables, <code>if</code> and <code>else</code> tags for... well, branching.</p>
<p>For the generic tag case, the <code>expr:</code> attribute name prefix indicates echoing the result of the expression in the attribute value. Also, there is a <code>var:</code> attribute value prefix that denotes a variable, and a <code>data:</code> attribute value prefix that denotes echoing a variable - either one can appear anywhere in the attribute value.</p>
<p>Anything that exceeds these simple commands should go in <em>action</em> files (auto-included if present) - a global <code>actions.php</code>, plus a specific <code>*_actions.php</code> for every template file. For example, that is where all the function and class definitions go.</p>
<p>My primary goal was to come up with a language that's simple and restrictive enough to facilitate the common tasks and enforce some code separation, without losing too much in expressiveness. I feel I'm still not there yet, though things look better with every iteration. I haven't given much thought to the generated project infrastructure just yet, but that's the next thing on my list. For example, generated scripts are the user entry points right now (the old scripting standard - <code>blah.php</code> you request, <code>blah.php</code> you're gonna get). However, it may be better to move them away from the webroot and have a single entry point (a single <code>index.php</code>, dispatching the user requests to particular scripts). Among other things, that way it would be easier to set up "clean" URLs. On the other hand, that may seem a bit too much for the targeted scope.</p>
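<p>To make the single entry point idea a bit more concrete, here is a rough sketch of what such an <code>index.php</code> could do. Everything in it is hypothetical: the <code>resolve_script</code> helper, the <code>page</code> request parameter, the route table and the paths are made up for illustration and do not describe the actual generator.</p>
<pre><code>// Hypothetical front controller helper: map a requested page name to a script path.
function resolve_script($page, array $routes, $default) {
    // Only whitelisted names are accepted; anything else falls back to the default.
    if (!is_string($page) || !isset($routes[$page])) {
        $page = $default;
    }
    return $routes[$page];
}

// Made-up route table; the generated scripts would live outside the webroot.
$routes = array(
    'home'  => '../generated/home.php',
    'about' => '../generated/about.php',
);
$page   = isset($_GET['page']) ? $_GET['page'] : 'home';
$script = resolve_script($page, $routes, 'home');
// index.php would then hand over control with: require $script;
</code></pre>
<p>Because the requested name is only ever used as a key into the route table, a request cannot point the dispatcher at an arbitrary file.</p>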
<div class="footnotes">
<hr>
<ol>
<li id="fn:1">
<p>While the last one might sound irrelevant to some, I learned that <abbr title="User Interface">UI</abbr> is one of the most important aspects of a project, especially a small-scale one. Making an efficient <abbr title="User Interface">UI</abbr> is just as important as efficient backend processing, and easily becomes <em>the</em> most important part when there isn't much backend processing to begin with, as is often the case in small projects. <a href="#fnref:1" rev="footnote">↩</a></p>
</li>
</ol>
</div>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com0tag:blogger.com,1999:blog-8812012544837063627.post-68990512342061245242007-05-31T12:52:00.000+02:002007-05-31T12:59:07.804+02:00PHP spaghetti<p>Last couple of days I've been toying with the idea of <abbr title="eXtensible Markup Language">XML</abbr> template-based <a href="http://en.wikipedia.org/wiki/Metaprogramming">metaprogramming</a>. Essentially, the idea is to create a simple <abbr title="Hyper Text Markup Language">HTML</abbr> page with some namespaced <abbr title="eXtensible Markup Language">XML</abbr> template elements (similar to <a href="http://help.blogger.com/bin/answer.py?answer=46995">Blogger new layout tags</a>) that would compile down to plain old <abbr title="PHP Hypertext Preprocessor">PHP</abbr> spaghetti before deploying. In other words, to have something like <strong>this</strong>:</p>
<pre><code><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:tpl='http://malatestapunk-stuff.blogspot.com'
xml:lang="en"
lang="en">
<tpl:set data="a_variable" value="a value" />
<tpl:set data="another_variable" action="FunctionName('arg1', 'arg2')" />
<tpl:if cond="data:a_variable == data:another_variable">
<tpl:data source="YetAnotherVarFromIncludedFile" />
</tpl:if>
<head>
<title><tpl:fetch source="TableArticles" data="Title" cond="id=1" /></title>
</head>
<body>
<tpl:loop source="TableArticles" cond="author='anyone'" as="Article">
<li>
<strong><tpl:data source="Article/Title" /></strong>, by
<em><tpl:data source="Article/Author" /></em> said:
<p> <tpl:data source="Article/Body" action="ClassName::methodName('arg2', 'arg3')" /> </p>
<p> <tpl:data action="Solo" /> </p>
</li>
</tpl:loop>
</body>
</html>
</code></pre>
<p>turned into something like <strong>this</strong>:</p>
<pre><code><?php
include ('config.php');
include ('actions.php');
$link = mysql_connect(DATABASE_HOST, DATABASE_USERNAME, DATABASE_PASSWORD);
mysql_select_db(DATABASE_NAME);
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:tpl="http://malatestapunk-stuff.blogspot.com" xml:lang="en" lang="en">
<?php $a_variable = 'a value'; $another_variable = FunctionName('arg1', 'arg2'); if ($a_variable == $another_variable) { echo $YetAnotherVarFromIncludedFile; } ?>
<head>
<title><?php
$_db_result = mysql_query("SELECT Title FROM " . TABLE_PREFIX . "TableArticles WHERE id=1");
$Title = mysql_fetch_array($_db_result, MYSQL_NUM);
echo $Title [0];
?></title>
</head>
<body>
<?php
$_db_result = mysql_query("SELECT * FROM " . TABLE_PREFIX . "TableArticles WHERE author='anyone'");
while ($Article = mysql_fetch_array($_db_result, MYSQL_ASSOC)) {
?>
<li>
<strong><?php echo $Article['Title']; ?></strong>, by
<em><?php echo $Article['Author']; ?></em> said:
<p><?php echo ClassName::methodName($Article['Body'], 'arg2', 'arg3'); ?></p>
<p><?php echo Solo(); ?></p>
</li>
<?php } ?>
</body>
</html>
</code></pre>
<p>before deploying it on the server.</p>
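<p>As a rough idea of what one compilation step might look like, here is a small JavaScript sketch that turns the attributes of a <code>tpl:set</code> tag into the PHP statement shown above. The attribute names (<code>data</code>, <code>value</code>, <code>action</code>) come from the template; the function <code>compileSet</code> and its quoting rules are assumptions for illustration, not the actual parser.</p>

```javascript
// Hypothetical sketch: compiles the attributes of a <tpl:set> tag to PHP.
function compileSet(attrs) {
  if (attrs.action !== undefined) {
    // Assumption: an action is emitted verbatim as a PHP expression.
    return "$" + attrs.data + " = " + attrs.action + ";";
  }
  // Assumption: a plain value becomes a single-quoted PHP string literal,
  // so backslashes and single quotes have to be escaped.
  const escaped = String(attrs.value)
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'");
  return "$" + attrs.data + " = '" + escaped + "';";
}

console.log(compileSet({ data: "a_variable", value: "a value" }));
// $a_variable = 'a value';
console.log(compileSet({ data: "another_variable", action: "FunctionName('arg1', 'arg2')" }));
// $another_variable = FunctionName('arg1', 'arg2');
```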
<h4>Reasoning</h4>
<p>While this approach may <em>not</em> be good enough for any serious stuff, if all you need is to get some data from the database and massage it into some <abbr title="eXtensible Hypertext Markup Language">XHTML</abbr> (which is very often the case), it can be a fast and easy way out. By altering only templates, you keep your code/logic separate from your presentation, which is, of course, very important. On top of that, once compiled to a script (which could easily be done either locally - say with <code>make</code> or a similar tool - or server-side) and deployed to a server, it will most likely outperform a more robust solution because of its specific scope and, thus, smaller footprint.</p>
<p>Another advantage would be simplified <abbr title="Relational DataBase Management System">RDBMS</abbr> switching by extending the main parser class and rebuilding the page scripts, without the added overhead of introducing a database abstraction layer to the script itself. And, compared to working with a full-blown framework, this approach allows for a much shorter development cycle, which is kinda cool for rapid prototyping.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com2tag:blogger.com,1999:blog-8812012544837063627.post-66150692130870643662007-05-19T13:54:00.000+02:002011-10-18T06:51:13.376+02:00Advanced tips for Total CommanderAs I hinted in my <a href="http://malatestapunk-stuff.blogspot.com/2007/04/using-total-commander.html">previous post on Total Commander</a>, there are lots of ways it can help you in your day to day tasks. There are all sorts of <a href="http://ghisler.com/addons.htm">addons</a> available, but you can make it do even more - a little bit of shell scripting and documentation reading goes a long way here, because 5 minutes spent today can (and will) save you hours in the future.<br />
There are a couple of (easy) ways you can extend TC functionality with your own actions, tailored to suit your workflow. You can either use TC internal commands, or make your scripts accessible from the <b>button bar</b>, the <b>Start menu</b>, or the <b>directory hotlist</b>. There is a big advantage to the button bar and start menu usage, as you get the special runtime parameters you can pass to your scripts. However, I find these harder to access than the directory hotlist. I have <kbd>Ctrl+D</kbd> hardcoded in my fingers and it's right in front of my eyes all the time, so I'll stick to it.<br />
<h4>The Directory Hotlist menu</h4><a href="http://bp2.blogger.com/_4Vwug7lU7r0/Rk7ovktAz1I/AAAAAAAAAA4/HGSNa4Z_gnI/s1600-h/hotlist-btton.jpg"><img alt="Directory hotlist button in TC" border="0" id="BLOGGER_PHOTO_ID_5066242534842355538" src="http://bp2.blogger.com/_4Vwug7lU7r0/Rk7ovktAz1I/AAAAAAAAAA4/HGSNa4Z_gnI/s400/hotlist-btton.jpg" style="cursor: hand; cursor: pointer; float: left; margin: 0 10px 10px 0;" /></a> Sure, we all add our frequently used directories there, and some of us perhaps even organize it in sections. But we can make it do more useful things - there are <i>a lot</i> of builtin TC commands available from there; plus, anything that can be executed from the shell can be executed from there as well. Just open the menu (either with your mouse, or by pressing <kbd>Ctrl+D</kbd>), and select <kbd><samp>Configure</samp></kbd>. In the window that opens, click the <kbd>Add Item</kbd> button, and enter the name for your new action. Then, either select an internal command from the <kbd>Command:</kbd> dropdown, or enter the shell command you wish to be executed. For example:<br />
<pre><code>cmd /c "tree > tree.txt"
</code></pre>This will create a file named <code>tree.txt</code> in your currently open directory, with the tree of its subdirectories. OK, so this wasn't very useful but you get the picture. Moving along. <br />
<a href="http://bp0.blogger.com/_4Vwug7lU7r0/Rk7o7EtAz2I/AAAAAAAAABA/wCmg-NiNjYw/s1600-h/command.jpg"><img alt="TC Command dialog" border="0" id="BLOGGER_PHOTO_ID_5066242732410851170" src="http://bp0.blogger.com/_4Vwug7lU7r0/Rk7o7EtAz2I/AAAAAAAAABA/wCmg-NiNjYw/s400/command.jpg" style="cursor: hand; cursor: pointer; display: block; margin: 0px auto 10px; text-align: center;" width="500px" /></a> <br />
Now, suppose I have some movies on my HD that I want to know more about. It is quite trivial to find the information on the 'net - I'd just go to <a href="http://www.imdb.com/">IMDb</a> and do a search for the title. However, it is a multi-step operation: open my browser, type the address, type the movie title, hit search. Obviously, that <i>should</i> be easier - I mean, that's why we have computers in the first place, right? Right. So let's make a script to automate this task for us and make it easily accessible in TC hotlist.<br />
<h4>What you <code>sed</code> is what you get</h4><div style="border-bottom: 1px solid #FdA; border-top: 1px solid #FdA;"><div style="margin: 0.4em 2em;"><i>Note</i>: you will need some special tools for this - namely, <code>sed</code> and <code>gawk</code> (you should have them around anyway - it's really good stuff). Fortunately, they're freely available as a part of <a href="http://unxutils.sourceforge.net/">UnxUtils package</a>, a porting effort of some common GNU utilities to native Win32.<br />
After you download the archive, unpack <code>sed.exe</code> and <code>gawk.exe</code> from <kbd>usr/local/wbin</kbd> to a directory in your path (e.g. <kbd>C:\WINDOWS</kbd>). </div></div>Thinking about it, almost every movie I encountered was in a directory named after its title. So let's use that as the starting point:<br />
<pre><code>cd | sed "s/.*\\//"
</code></pre>We use the <code>cd</code> output to find out the current working directory. Unfortunately, <code>cd</code> outputs the full path (e.g. <samp>D:\Desktop\Batch</samp>) instead of just the directory name we need. That's why we send the output to <code>sed</code> - this will replace everything up to and including the last <code>\</code> with nothing. (If you'd like to know more about <code>sed</code>, there are a lot of good tutorials around. You may want to try <a href="http://www.grymoire.com/Unix/Sed.html">this one</a>.)<br />
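If regular expressions are not your thing, the same substitution can be sketched in a few lines of JavaScript. The function name <code>lastPathComponent</code> is just something made up for this illustration; the regular expression is the same one the <code>sed</code> command uses.<br />

```javascript
// JavaScript equivalent of: sed "s/.*\\//"
// The greedy .* swallows everything up to and including the LAST backslash,
// and the match is replaced with an empty string.
function lastPathComponent(path) {
  return path.replace(/.*\\/, "");
}

console.log(lastPathComponent("D:\\Desktop\\Batch")); // prints "Batch"
```

Note that a path ending in a backslash (a drive root such as <samp>D:\</samp>) comes out empty, so this only makes sense inside an actual directory.<br />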
There is yet another problem: <code>cmd.exe</code> is not cool at all. We can't just assign the output we just got to a shell variable in order to use it later - not in a straightforward way (using <code>SET</code>), anyway. However, there <i>is</i> a way to do just that:<br />
<pre><code>cd | sed "s/.*\\/set directory_name=/" > tmp_.bat
call tmp_.bat
del tmp_.bat
</code></pre>The code above will replace everything up to the last <code>\</code> from the <code>cd</code> output with <code>set directory_name=</code> and dump that in a file named <code>tmp_.bat</code>, so the newly created file would contain something like:<br />
<pre><code>set directory_name=Whatever Directory Name
</code></pre>Next, when we <code>CALL</code> the <code>tmp_.bat</code>, it will set the <code>directory_name</code> environment variable in the current shell session. After that, we don't need the intermediate script anymore, so we delete it (using <code>DEL</code>). And the hard part is actually over: the only thing left is to do the search. Fortunately, IMDb accepts <code>GET</code> search requests, so this is trivial:<br />
<pre><code>"C:\Program Files\Mozilla Firefox\firefox.exe" "http://www.imdb.com/find?s=tt&q=%directory_name%"
</code></pre>This will open Firefox at the given URL - which is the IMDb search results page we were after all along. Now, we wrap this up in a batch file:<br />
<pre><code>cd | sed "s/.*\\/set nme=/" > tmp_.bat
call tmp_.bat
del tmp_.bat
"C:\Program Files\Mozilla Firefox\firefox.exe" "http://www.imdb.com/find?s=all&q=%nme%"
</code></pre>and save it somewhere. Next, we do <kbd>Ctrl+D</kbd> -> <kbd>Configure</kbd> -> <kbd>Add Item</kbd>, enter a descriptive name for our new action, and enter this at the <kbd>Command:</kbd> line:<br />
<pre><code>cmd /c "Full_Path_To_Your_Batch_File.bat"
</code></pre>Every time you execute your new action, the script will run in the current directory - the one you're browsing in the currently active TC pane.<br />
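One thing the batch file does not do is encode the directory name before putting it in the URL; it relies on the browser to cope with spaces, and a name containing <code>&amp;</code> would most likely confuse the query. As a hedged illustration of what proper encoding looks like, here is a JavaScript sketch. The function name <code>imdbSearchUrl</code> and the sample title are made up; <code>encodeURIComponent</code> is a standard JavaScript function.<br />

```javascript
// Hypothetical sketch: builds the same search URL as the batch file,
// but percent-encodes the title first. The batch file itself does NOT do this.
function imdbSearchUrl(title) {
  return "http://www.imdb.com/find?s=all&q=" + encodeURIComponent(title);
}

console.log(imdbSearchUrl("Fear & Loathing"));
// http://www.imdb.com/find?s=all&q=Fear%20%26%20Loathing
```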
<div style="border-bottom: 1px solid #FdA; border-top: 1px solid #FdA;"><div style="margin: 0.4em 2em;"><i>Note</i>: there may be more elegant ways of doing such things, but I like to keep my scripts hackish and simple. For me, a shell script is a quick and dirty way out - it should be intuitive to write and quickly provide the results without too much hassle. Anything that needs to be elegant should be done in a proper programming language anyway.</div></div><h4>Search for more info on MP3s</h4>I like to know stuff about what I'm listening to. There is a lot of information available on the web about different artists and albums, and I find myself often doing repetitive work to get to it. However, it is quite easy to get all (or most) of it at once with a simple script, similar to the one we just did.<br />
However big your MP3 collection is, chances are that most of the files follow one of the most common folder schemes - either <code>ArtistName\AlbumName</code> or <code>ArtistName - AlbumName</code>. The approach is very similar for both cases: we'll use <code>gawk</code> to extract the information from the path provided by the <code>cd</code> command. First, the <code>ArtistName\AlbumName</code> case:<br />
<pre><code>cd | gawk -F\\ "{print (\"set artist=\" $(NF-1) \"\nset album=\" $NF)}" > tmp_.bat
</code></pre>We'll use the same method to obtain the command output as environment variables, but with a different tool (<code>gawk</code>) to get it done. First we declare <code>\</code> to be the <i>field separator</i> - that means that <code>gawk</code> will split whatever we give it into separate fields at each <code>\</code>. We can access fields by attaching <code>$</code> to the number of the field that we want, left-to-right (<code>$1</code> for the first one, <code>$2</code> for the second, etc). Since we want the one on the far right, we use <code>NF</code> - it means <i>number of fields</i> in the <code>gawk</code> lingo. We also need the one just before the last, so we subtract 1 from the number of fields - that's what <code>$(NF-1)</code> means.<br />
If none of this makes sense to you, don't worry. <code>Awk</code> (<code>gawk</code> is just a flavor of <code>awk</code>) is very well documented, and there are a lot of tutorials around. The basics are covered <a href="http://www.vectorsite.net/tsawk.html">here</a>, and if you'd like to learn more, try <a href="http://www.uga.edu/%7Eucns/wsg/unix/awk/">this one</a>.<br />
We also need to print literal text, and therefore we need double quotes inside the <code>gawk</code> program. That is why we <i>escape</i> the quotes in the <code>print</code> statement by adding <code>\</code> in front of them. This way, <code>cmd.exe</code> will leave those quotes alone, and let <code>gawk</code> parse them instead. Note that we need each <code>SET</code> statement to be on a separate line in the intermediate batch file. That's why we separate them with <code>\n</code>, which means <i>"insert newline here"</i>.<br />
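For those more at home in JavaScript than in <code>awk</code>, the field logic can be sketched like this. The function names and the sample path are made up for the illustration, and the sketch assumes the path has at least two components.<br />

```javascript
// Mirrors: gawk -F\\ "{print ... $(NF-1) ... $NF}"
function artistAndAlbumFromPath(path) {
  const fields = path.split("\\");  // -F\\ : backslash is the field separator
  const nf = fields.length;         // NF: number of fields
  return {
    artist: fields[nf - 2],         // $(NF-1): awk counts from 1, JavaScript from 0
    album: fields[nf - 1]           // $NF: the last field
  };
}

// Builds the text of the intermediate batch file, one SET statement per line.
function toBatch(info) {
  return "set artist=" + info.artist + "\nset album=" + info.album;
}

console.log(toBatch(artistAndAlbumFromPath("D:\\Music\\Tom Waits\\Rain Dogs")));
// set artist=Tom Waits
// set album=Rain Dogs
```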
The <code>ArtistName - AlbumName</code> case is quite similar. We'll build on what we got from the previous scripts:<br />
<pre><code>cd | sed "s/.*\\//" | gawk -F- "{print (\"set artist=\" $1 \"\nset album=\" $2)}" > tmp_.bat
</code></pre>So, first we extract the last directory in the current path, just like in the IMDb search script. Next, we declare '<code>-</code>' to be the <i>field separator</i> because we want to split the directory name at the dash character. We assume that the artist name is on the left side of the first dash and that the album name follows it, so we can use simple <code>$1</code> and <code>$2</code>. Anything after a second dash is simply dropped, and the spaces around the dash stay in - but we'll be doing a web search, so we don't have to get surgical about this.<br />
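The dash-splitting behaviour is easy to see in a JavaScript sketch of the same step. The function name and the directory names are hypothetical examples.<br />

```javascript
// Mirrors: gawk -F- "{print ... $1 ... $2}" applied to the last path component.
function artistAndAlbumFromName(dirName) {
  const fields = dirName.split("-");            // -F- : split at EVERY dash
  return {
    artist: fields[0],                           // $1, trailing space included
    album: fields.length > 1 ? fields[1] : ""    // $2, empty when there is no dash
  };
}

console.log(JSON.stringify(artistAndAlbumFromName("Tom Waits - Rain Dogs")));
// {"artist":"Tom Waits ","album":" Rain Dogs"}

console.log(JSON.stringify(artistAndAlbumFromName("Tom Waits - Rain Dogs - Remastered").album));
// " Rain Dogs "
```

The stray spaces and the dropped third field are harmless for a web search, which is why the batch script does not bother cleaning them up.<br />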
Once we have our <code>%artist%</code> and <code>%album%</code> variables set in the intermediate batch file, the rest of the script is the same for both cases. We use the old technique to get to them and then we put them to use:<br />
<pre><code>REM ...
call tmp_.bat
del tmp_.bat
REM +---------------------------------------------------------------------------------
REM This is where we set up search URLs.
REM Comment out the ones you don't need, or add some more.
REM +---------------------------------------------------------------------------------
set discogs="http://www.discogs.com/search?type=artist&q=%artist%"
set cduni_artist="http://www.cduniverse.com/sresult.asp?HT_Search_Info=%artist%&HT_Search=ARTIST"
set cduni_album="http://www.cduniverse.com/sresult.asp?HT_Search_Info=%album%&HT_Search=TITLE"
set gimg_artist="http://images.google.com/images?q=%artist%"
set gimg_album="http://images.google.com/images?q=%album%"
set gimg_both="http://images.google.com/images?q=%artist% %album%"
set wiki="http://en.wikipedia.org/wiki/Special:Search?search=%artist%"
REM +---------------------------------------------------------------------------------
REM OK, we're all set, let's do some searching:
REM +---------------------------------------------------------------------------------
"C:\Program Files\Mozilla Firefox\firefox.exe" %discogs% %cduni_artist% %cduni_album% %gimg_artist% %gimg_album% %gimg_both% %wiki%
</code></pre>Now, we just substitute the <code>REM ...</code> line with the appropriate gawk command, and the script is ready to go into the directory hotlist as a brand new action.Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com0tag:blogger.com,1999:blog-8812012544837063627.post-38841539529449453612007-04-20T10:36:00.000+02:002007-04-20T10:52:43.902+02:00Using Total Commander<p>Although I'm not very keen on Shareware, <a href="http://www.ghisler.com/">Total Commander</a> is <strong>the</strong> file manager for windows and for me, nothing comes even close to it. It has many useful features, both basic and advanced, to make your life so much easier. It's extensible, too: there are <a href="http://www.ghisler.com/plugins.htm">plugins</a> for <a href="http://ghisler.fileburst.com/fsplugins/b2l4tc.zip">file undeletion</a>, <a href="http://ghisler.fileburst.com/fsplugins/ex2fs.zip">accessing ext2, ext3 and reiser partitions</a>, opening <a href="http://ghisler.fileburst.com/plugins/bzipplug.zip">bz2</a>, <a href="http://ghisler.fileburst.com/plugins/wc_rpm-1.5.zip">rpm</a>, <a href="http://ghisler.fileburst.com/plugins/iso_plugin.zip">iso</a>, <a href="http://ghisler.fileburst.com/plugins/iclread.zip">icl</a>, <a href="http://ghisler.fileburst.com/plugins/msi_plugin.zip">msi</a> files, <a href="http://ghisler.fileburst.com/content/wdx_xpdfsearch.zip">fulltext searching in PDFs</a>, accessing your <a href="http://www.acd-group.com/Softwaredevelopment/POP3Plugin"><abbr title="Post Office Protocol, Version 3">POP3</abbr>/<abbr title="Simple Mail Transport Protocol">SMTP</abbr> server</a>, <a href="http://www.geocities.com/rs0319/">Symbian telephone</a> or <a href="http://www.totalcmd.net/plugring/IrivIFPPlug.html">iRiver flash player</a>, etc. And it can be <a href="http://www.ghisler.com/usbinst.htm">copied to your <abbr title="Universal Serial Bus">USB</abbr> stick</a>, so you can carry it around with you.</p>
<p>Even without a single addon, it is by far the most useful file manager I have ever used. It has internal zip/gzip archive support, advanced renaming options for multiple files, file/directory comparison, different views and filters, <abbr title="File Transfer Protocol">FTP</abbr> support, great search capabilities, two-paned interface with tabs, history and configurable bookmarks.</p>
<p>Actually, configurable bookmarks are perhaps the most valuable, and yet the most underrated feature of this tool. While you can just add locations you frequently visit (by clicking on the "Add current dir" option in the dropdown list), you can add some really useful actions there as well, while keeping everything neatly organized, hierarchically. This is done in a simple and convenient interface, opened by clicking on the "Configure" option at the bottom of the dropdown list.</p>
<p>While adding directory locations and organizing the menu are pretty obvious tasks, adding actions is not as straightforward (still, it is quite easy). You can add any kind of action that you'd do in a shell - for example, you can make a quick file list text file with this action:</p>
<pre><code>cmd.exe /c "dir /b /oG > listing.txt"</code></pre>
<p>Of course, that would list all the files in the current working directory - the one that's open in your active TC pane. <strong>Hint</strong>: you may want to consider installing <a href="http://unxutils.sourceforge.net/">Unix utils</a> somewhere in your path for even more power. For example, you could generate a track list for an audio CD cover from your mp3 folder with an action command like this:</p>
<pre><code>cmd /c "dir /B /oG *.mp3 | sed s/\.mp3\b//i | gawk "{print NR \") \" $0}" > list.txt"</code></pre>
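<p>In case the pipeline looks cryptic: <code>sed</code> strips the <code>.mp3</code> extension (case-insensitively) and <code>gawk</code> prefixes every line with its number. Here is the same transformation as a JavaScript sketch - the function name and the file names are made up for illustration:</p>

```javascript
// Mirrors: sed s/\.mp3\b//i | gawk "{print NR \") \" $0}"
function trackList(fileNames) {
  return fileNames
    .map(name => name.replace(/\.mp3\b/i, ""))         // sed: drop the extension
    .map((name, index) => (index + 1) + ") " + name);  // gawk: NR ") " $0
}

console.log(trackList(["Intro.mp3", "Second Song.MP3"]).join("\n"));
// 1) Intro
// 2) Second Song
```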
<p>As of v5.51, you can also choose one of Total Commander's internal commands from the dropdown combobox - e.g. <code>cm_OpenDesktop</code> to switch to the Desktop folder. Even more useful internal commands (at least, the ones I use all the time) include:</p>
<dl>
<dt><code>cm_CopyNamesToClip</code></dt>
<dd>Puts name(s) of selected file(s) without path information on the clipboard - my girlfriend uses this all the time to make Photoshop audio CD covers. This works on directories as well.</dd>
<dt><code>cm_CopyFullNamesToClip</code></dt>
<dd>Puts full path(s) of selected file(s) on the clipboard. This also works on directories.</dd>
<dt><code>cm_CountDirContent</code></dt>
<dd>Quickly calculate the total size of one or more selected directories.</dd>
<dt><code>cm_SelectCurrentExtension</code></dt>
<dd>Selects files with the same extension as the one that's currently selected.</dd>
</dl>
<p>Now, I <em>could</em> just install about a dozen (or more) other tools instead and do all the stuff I do with TC, but it's so great having all that in just one tool - it keeps my workflow uninterrupted, my desktop/quicklaunch/start menu uncluttered and my fragile spiritual balance undisturbed.</p>
<p>Of course, there are downsides: TC is a Windows-only tool (although it has been ported to Windows CE/Pocket PC). Also, it is a Shareware program, which means that you can test it for a period of 30 days. After testing the program, you are expected to either order the full version, or delete the program from your harddisk. Ouch.</p>
<p>There are a few freeware alternatives - one of the most usable ones I tried is <a href="http://www.freecommander.com/">FreeCommander</a>. Unfortunately, it has a much more limited set of builtin features and no plugin support - though it still beats the hell out of Explorer.</p>
<p>In the free world, many people - including me - feel that <a href="http://www.ibiblio.org/mc/">mc (Midnight Commander)</a> is still the best file manager there is. However, a relatively recent project looks quite promising: <a href="http://www.nongnu.org/gcmd">GNOME Commander</a> reached version 1.2.3 (stable), with plugin support announced to be introduced in version 1.3.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com1tag:blogger.com,1999:blog-8812012544837063627.post-84346443615590585342007-03-05T12:45:00.000+01:002007-07-20T09:52:00.790+02:00Twango + widgets = Twidgets<p><a href="http://www.twango.com/">Twango</a> is a media sharing service I use mainly to post photos of myself and my girlfriend so we don't have to go out and relate to other people. It is also an excellent way to show all your friends and family how you would look with a Ron Jeremy mustache without freaking out people on public transportation. For all those who wonder, Twango:</p>
<blockquote>
<p>... <em>is a term used by guitar players to refer to a certain style of twangy-sounding guitar music. You might call it a kind of "cowboy surf" sound</em></p>
</blockquote>
<p>As it happens, I kind of like cowboy surf. I also happen to like services that allow you to upload and share your stuff in lots of different (and quite advanced) ways. As far as I understood from <a href="http://www.twango.com/faq">their <abbr title="Frequently Asked Questions">FAQ</abbr> section</a>, a public <abbr title="Application Programming Interface">API</abbr> is on the way as well, so Twango does it for me.</p>
<p><em>Twidgets</em> are Twango widgets -- code snippets you can use to integrate your Twango channels with other content. They come in two flavors, Flash and JavaScript, and there are a number of them ready to be used. However, since none of them completely suited my (quite unstable) taste - <em>and</em> since I like to fiddle with stuff - I rolled up my sleeves and <a href="http://www.twango.com/channel/malatestapunk.twidgets">made a couple of them on my own</a>. You can see an example result <del>in the sidebar of this site.</del> right here:</p>
<script src='http://feeds.twango.com/js/channelmedia.aspx?channelname=malatestapunk.stuff' type='text/javascript'></script>
<script type='text/javascript'>
var IeTransparent = (document.all) ? "white" : "transparent";
</script>
<script type='text/javascript'>
//<![CDATA[
function rotatorTwidget (prefs) {
var divID = "twangoRotator";
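var imgOutDiv;
var outDiv;
var upDiv, dnDiv; // scroller arrows, created in initDivs (kept local instead of leaking as globals)
var __current = 0; // index of the thumbnail currently shown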
var twangoImages = new Array();
var _imgWidth = 100;
var _imgHeight = 100;
var _divHeight = 0;
var _divWidth = 0;
var _scrollerWidth = 10;
var _scrollerUpBg = false;
var _scrollerDnBg = false;
var _scrollerColor = "#dddddd";
var _scrollerColorActive = "red";
var _onAction = "onclick";
var initDivs = function () {
_divHeight = (_divHeight) ? _divHeight : _imgHeight;
_divWidth = (_divWidth) ? _divWidth : _imgWidth + (2*_scrollerWidth);
outDiv.style.position = "relative";
outDiv.style.width = _divWidth + "px";
outDiv.style.height = _divHeight + 'px';
outDiv.style.overflow = "hidden";
imgOutDiv = document.createElement ('div');
upDiv = document.createElement ('div');
dnDiv = document.createElement ('div');
upDiv.style.position = "absolute";
upDiv.style.top = "0";
upDiv.style.left = "0";
if (_scrollerUpBg) {
upDiv.style.width = _scrollerWidth + "px";
upDiv.style.height = _divHeight + "px";
upDiv.style.backgroundImage = "url(" + _scrollerUpBg + ")";
upDiv.style.backgroundPosition = '0px ' + ((_divHeight-_imgHeight)/2) + 'px';
upDiv.style.backgroundRepeat = "no-repeat";
} else {
upDiv.style.width = "0px";
upDiv.style.height = "0px";
upDiv.style.borderStyle = "solid";
upDiv.style.borderLeftStyle = "none";
upDiv.style.borderTopWidth = _divHeight/2 + 'px';
upDiv.style.borderTopColor = IeTransparent;
upDiv.style.borderBottomWidth = _divHeight/2 + 'px';
upDiv.style.borderBottomColor = IeTransparent;
upDiv.style.borderRightColor = _scrollerColor;
upDiv.style.borderRightWidth = _scrollerWidth + "px";
}
upDiv[_onAction] = getPrev;
dnDiv.style.position = "absolute";
dnDiv.style.top = "0";
dnDiv.style.left = _divWidth - _scrollerWidth + 'px';
if (_scrollerDnBg) {
dnDiv.style.width = _scrollerWidth + "px";
dnDiv.style.height = _divHeight + "px";
dnDiv.style.backgroundImage = "url(" + _scrollerDnBg + ")";
dnDiv.style.backgroundPosition = 'bottom ' + (_divWidth-_imgWidth)/2 + 'px';
dnDiv.style.backgroundRepeat = "no-repeat";
} else {
dnDiv.style.width = "0px";
dnDiv.style.height = "0px";
dnDiv.style.borderStyle = "solid";
dnDiv.style.borderRightStyle = "none";
dnDiv.style.borderTopWidth = _divHeight/2 + 'px';
dnDiv.style.borderTopColor = IeTransparent;
dnDiv.style.borderBottomWidth = _divHeight/2 + 'px';
dnDiv.style.borderBottomColor = IeTransparent;
dnDiv.style.borderLeftColor = _scrollerColor;
dnDiv.style.borderLeftWidth = _scrollerWidth + "px";
}
dnDiv[_onAction] = getNext;
outDiv.appendChild(upDiv);
outDiv.appendChild(imgOutDiv);
outDiv.appendChild(dnDiv);
__current = 0;
}
var getNext = function () {
if (__current+1 >= twangoImages.length) return false;
__current++;
putImageThumb(__current-1);
}
var getPrev = function () {
if (__current == 0) return false;
__current--;
putImageThumb(__current+1);
}
var putImageThumb = function (old) {
if (old !== false) imgOutDiv.removeChild(twangoImages[old]);
imgOutDiv.appendChild(twangoImages[__current]);
}
var init = function () {
if (twangoImage_array.length <= 0) return false;
outDiv = document.getElementById(divID);
outDiv.innerHTML = '';
initDivs();
var im, alink, i; // declare the loop counter so it does not leak as a global
for (i=0; i<twangoImage_array.length; i++) {
alink = document.createElement ('a');
alink.href = "http://www.twango.com/media/"+channelName+"/" + twangoImage_array[i][0];
alink.target = "_blank";
im = new Image();
im.style.border = "none";
im.id = twangoImage_array[i][0];
im.alt = twangoImage_array[i][0];
im.title = twangoImage_array[i][0];
im.src = twangoImage_array[i][1];
im.style.position = "absolute";
im.style.left = (((_divWidth - _imgWidth)/2) > _scrollerWidth) ?
((_divWidth - _imgWidth)/2) + 'px' : _scrollerWidth + "px";
im.style.top = (_divHeight-_imgHeight)/2 + 'px';
alink.appendChild(im);
twangoImages[i] = alink;
}
putImageThumb(false);
}
var parsePrefs = function () {
if (!prefs) return false;
if (prefs['divID']) divID = prefs['divID'];
if (prefs['onAction']) _onAction = prefs['onAction'];
if (prefs['width']) _divWidth = prefs['width'];
if (prefs['height']) _divHeight = prefs['height'];
if (prefs['scrollerUpBg']) _scrollerUpBg = prefs['scrollerUpBg'];
if (prefs['scrollerDnBg']) _scrollerDnBg = prefs['scrollerDnBg'];
if (prefs['scrollerWidth']) _scrollerWidth = prefs['scrollerWidth'];
if (prefs['scrollerColor']) _scrollerColor = prefs['scrollerColor'];
}
parsePrefs ();
init();
}
//]]>
</script>
<script type='text/javascript'>
//<![CDATA[
window.onload = function () {
new rotatorTwidget (
{
'divID': 'twangoWidget',
'width': 130
}
);
}
//]]>
</script>
<div class='widget'>
<h2>My Twango photos</h2>
<div id='twangoWidget'>
<p class='internalError'>Fetching Twango feed ...</p>
</div>
</div>
<p>The scripts <em>should</em> work in most recent browsers - I tested them in IE7, FF2 and O9.</p>
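<p>If you want to tweak the widget, the object passed to <code>rotatorTwidget</code> simply overrides the built-in defaults. The sketch below restates that override logic on its own, so it can be read (and run) without a browser. The names <code>twidgetDefaults</code> and <code>resolvePrefs</code> are made up for this illustration; the keys and default values are the ones used in the script above.</p>

```javascript
// Defaults as declared at the top of rotatorTwidget (a width/height of 0 means
// "derive it from the thumbnail size").
const twidgetDefaults = {
  divID: "twangoRotator",
  onAction: "onclick",
  width: 0,
  height: 0,
  scrollerUpBg: false,
  scrollerDnBg: false,
  scrollerWidth: 10,
  scrollerColor: "#dddddd"
};

// Hypothetical restatement of parsePrefs: every recognised, truthy key in
// prefs replaces the default; everything else keeps its default value.
function resolvePrefs(prefs) {
  const resolved = Object.assign({}, twidgetDefaults);
  if (!prefs) return resolved;
  for (const key of Object.keys(twidgetDefaults)) {
    if (prefs[key]) resolved[key] = prefs[key];
  }
  return resolved;
}

console.log(resolvePrefs({ divID: "twangoWidget", width: 130 }));
```

<p>One consequence of the truthiness check: you cannot set a value to <code>0</code> or an empty string, since those are treated as "not given".</p>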
<p>I also made a <a href="http://www.twango.com/channel/malatestapunk.twidgets">Twango channel dedicated to twidget development</a>, so feel free to browse.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com1tag:blogger.com,1999:blog-8812012544837063627.post-49720615449293868792007-02-17T10:31:00.000+01:002007-03-08T10:34:37.311+01:00Use the Web: use Tidy<p>Once you start web programming, it's only a matter of time before you face the task of fetching and parsing the existing contents - be it a web page, a feed, or whatever. Since that content is most likely to be a markup language of some sort, it would be great if you could use a generic parser to weed through it. In fact, since you <em>know</em> your input is some <abbr title="eXtensible Markup Language">XML</abbr> dialect (say, <abbr title="eXtensible Hypertext Markup Language">XHTML</abbr>, RSS or Atom), it would be great if you could just use some ready-made <abbr title="eXtensible Markup Language">XML</abbr> parser to reach the portions you need through XPath expressions or <abbr title="Document Object Model">DOM</abbr> functions. But that is where the grief begins.</p>
<h4>Guess what? It's not gonna work.</h4>
<p>So the page you're fetching is boasting to be <abbr title="eXtensible Hypertext Markup Language">XHTML</abbr> strict, but your parser keeps croaking on you. Why is that happening? Have you done something in your previous life to annoy a deity of some sort? Well actually, most existing <abbr title="eXtensible Markup Language">XML</abbr> parsers are quite picky - and they should be, since there is only a handful of rules they expect to be fulfilled. However, for one reason or another - and this especially goes for <abbr title="eXtensible Hypertext Markup Language">XHTML</abbr>, since people tend to take feed validity more seriously - they seldom are.</p>
<h4>So, what do you do?</h4>
<p>Meet your new best friend: <a href="http://tidy.sourceforge.net/">HTML Tidy</a>. It will do all the nasty cleanup and repair stuff for you, and leave you with a usable document. Originally a <a href="http://www.w3.org/People/Raggett/">Dave Raggett</a> utility program, it is now maintained by <a href="http://tidy.sourceforge.net/">a group of dedicated volunteers on SourceForge</a>. One of their goals was to make <q>a library form of Tidy</q>, <q>to make it easier to incorporate Tidy into other software</q>.</p>
<p>And that they did - for example, Tidy is an integral part of many current applications. It is available as a PECL extension for <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 4.3.x and <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 5 from <a href="http://pecl.php.net/package/tidy">http://pecl.php.net/package/tidy</a>, and there are bindings for <a href="http://users.rcn.com/creitzel/tidy.html">many other languages as well</a>.</p>
<h4>Using Tidy PHP extension</h4>
<p>There are two flavors of Tidy for <abbr title="PHP Hypertext Preprocessor">PHP</abbr>: Tidy 1.0 is just for <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 4.3.x, while Tidy 2.0 is just for <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 5. This is how you'd use Tidy 1.0 with 4.3:</p>
<pre><code>// ...
// Let's assume you already obtained the page you want to clean up in string $html
// ...
$config = array (
'ncr' => true, // allow numeric entities
'numeric-entities' => true, // output numeric instead of named entities
'quote-nbsp' => true, // quote non-breaking space character
'fix-uri' => true, // fix ampersands and such in URIs
'output-xml' => true, // output XML; could be XHTML as well, I think
'char-encoding' => 'utf8' // use UTF-8 encoding
);
tidy_parse_string($html);
foreach ($config as $key=>$value) {
tidy_setopt($key, $value);
}
tidy_clean_repair();
$html = tidy_get_output();
// ...
// Now $html contains cleaned up original page, ready for XML parser
// ...
</code></pre>
<p>The most important part is the <code>$config</code> array. This is where we set up Tidy to make the input string <code>$html</code> parser-friendly. There are a lot of other parameters for Tidy, but these are the basic ones that should correct almost any page. For a full reference on Tidy parameters, check out <a href="http://tidy.sourceforge.net/docs/quickref.html">http://tidy.sourceforge.net/docs/quickref.html</a>.</p>
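<p>For Tidy 2.0 on <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 5, the same cleanup could look roughly like the sketch below. It assumes the tidy extension is loaded; the sample <code>$html</code> string is a stand-in for a fetched page, and the encoding is passed as the third argument of <code>parseString()</code> instead of through the <code>char-encoding</code> option:</p>
<pre><code>// Stand-in for a fetched page (assumption, for illustration only)
$html = '<html><head><title>Test</title></head><body><p>Fish &amp; chips<br></body></html>';

// Same options as in the Tidy 1.0 example
$config = array (
    'ncr' => true,              // allow numeric entities
    'numeric-entities' => true, // output numeric instead of named entities
    'quote-nbsp' => true,       // quote non-breaking space character
    'fix-uri' => true,          // fix ampersands and such in URIs
    'output-xml' => true        // output XML
);

// In Tidy 2.0 the document is an object, so there is no implicit global state
$tidy = new tidy();
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();
$clean = tidy_get_output($tidy);
// $clean now holds the repaired markup, ready for an XML parser
</code></pre>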
<h4>Using Tidy executable</h4>
<p>On the down side, it's possible you don't have the Tidy extension available in your environment. If that's the case, you might be able to use the standalone Tidy executable. To do that, you first need to make a Tidy configuration file. This is an example file, with the same options as in the above example:</p>
<pre><code>ncr: 1 # allow numeric entities
numeric-entities: 1 # output numeric instead of named entities
quote-nbsp: 1 # quote non-breaking space character
fix-uri: 1 # fix ampersands and such in URIs
output-xml: 1 # output XML; could be XHTML as well, I think
char-encoding: utf8 # use UTF-8 encoding
</code></pre>
<p>Save that file as <code>tidy.conf</code> in the same directory where your script is. Next, in your script, do something along these lines:</p>
<pre><code>// ...
// Let's assume you already obtained the page you want to clean up in string $html
// ...
define ('PATH_TO_YOUR_CONFIG_FILE', 'tidy.conf'); // path to the config file saved earlier
$filename = tempnam("", "OUT");
$fp = fopen($filename, 'w');
fwrite($fp, $html);
fclose($fp);
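// Quote each argument separately so that paths containing spaces survive the shell
$cmd = 'tidy -q -config ' . escapeshellarg (realpath (PATH_TO_YOUR_CONFIG_FILE)) . ' ' . escapeshellarg ($filename);
$html = shell_exec ($cmd);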
unlink ($filename);
// ...
// Now $html contains cleaned up original page, ready for XML parser
// ...
</code></pre>
<p>And that should be it. Once again, note that Tidy accepts <strong>a lot</strong> of (well documented) configuration parameters. For a full list, check out <a href="http://tidy.sourceforge.net/docs/quickref.html">http://tidy.sourceforge.net/docs/quickref.html</a>.</p>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com1tag:blogger.com,1999:blog-8812012544837063627.post-28632272170055650482007-01-13T06:50:00.000+01:002007-03-06T10:55:44.594+01:00PHP microformats parser<p><a href="http://www.phpclasses.org/browse/package/3597.html">Microformats parser</a> is a <abbr title="PHP Hypertext Preprocessor">PHP</abbr> package for extracting the microformats data
embedded into <abbr title="Hyper Text Markup Language">HTML</abbr>. The gathered data is stored as an <a href="http://www.phpclasses.org/browse/package/3565.html">xArray</a> of objects -
one for each microformat type container found.</p>
<h4>Requirements</h4>
<p><a href="http://www.phpclasses.org/browse/package/3597.html">Microformats parser</a> requires <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 4.3 with the <a href="http://www.php.net/manual/en/ref.domxml.php"><abbr title="Document Object Model">DOM</abbr> <abbr title="eXtensible Markup Language">XML</abbr></a> extension. The <abbr title="Document Object Model">DOM</abbr> <abbr title="eXtensible Markup Language">XML</abbr> extension no longer ships with PHP 5, but that problem was solved thanks to Alexandre Alapetite and Ludwig: it is now possible to make the parser work with PHP 5 by following <a href="http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/index.en.html">this article</a>.</p>
<p><a href="http://www.phpclasses.org/browse/package/3597.html">Microformats parser</a> also requires the <a href="http://www.phpclasses.org/browse/package/3565.html">xArray</a> package, which is not included by default. So, in order to use the parser, you need to download the <a href="http://www.phpclasses.org/browse/package/3565.html">xArray</a> package (you can do it <a href="http://www.phpclasses.org/browse/package/3565.html">here</a>) and extract the xArray.php file into the <code>lib/</code> directory of your parser.</p>
<h4>Supported microformats</h4>
<p>The parser supports <em>most</em> of the <a href="http://microformats.org/wiki/hcard">hCard</a> (missing SOUND), <a href="http://microformats.org/wiki/hcalendar">hCalendar</a>, <a href="http://microformats.org/wiki/hreview">hReview</a> (missing item info; spec really needs some clarification) and <code>rel</code> elements, according to their respective specifications on the <a href="http://microformats.org/wiki">microformats Wiki</a>.</p>
<h4>Usage</h4>
<p>The simplest usage example:</p>
<pre><code>$filename = "http://microformats.org/about/people/";
$html = file_get_contents($filename);
$mfParser = new MicroFormatParser();
$mf = $mfParser->parseSource($html);
if ($mf) $mf->each('
echo "<h1>".get_class($value)."</h1>";
var_export($value);
echo "<hr />";
');
</code></pre>
<p>As you can see, the parser expects <abbr title="Hyper Text Markup Language">HTML</abbr> string input. That is because there are a lot of different ways you can fetch a page, so you're free to use whichever one works for you. Another reason is that <abbr title="Document Object Model">DOM</abbr> <abbr title="eXtensible Markup Language">XML</abbr> expects valid <abbr title="eXtensible Markup Language">XML</abbr> - in our case, an <abbr title="eXtensible Hypertext Markup Language">XHTML</abbr> document. Since many pages out there are <em>near valid</em> but not <em>really, really valid</em>, you can use <a href="http://www.php.net/manual/en/ref.tidy.php"><abbr title="PHP Hypertext Preprocessor">PHP</abbr> Tidy functions</a> (if available on your machine) to keep the parser from choking to death.</p>
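<p>One way to wire Tidy in front of the parser is sketched below. The helper name <code>tidy_if_available()</code> is hypothetical (it is not part of the package), the object-style Tidy calls assume <abbr title="PHP Hypertext Preprocessor">PHP</abbr> 5, and the input string is made up; when the extension is missing, the markup is passed through untouched:</p>
<pre><code>// Hypothetical helper: clean the markup with Tidy when the extension
// is loaded, otherwise return the input unchanged.
function tidy_if_available($html) {
    if (!extension_loaded('tidy')) {
        return $html;
    }
    $config = array (
        'output-xhtml' => true,     // produce XHTML
        'numeric-entities' => true  // output numeric instead of named entities
    );
    $tidy = new tidy();
    $tidy->parseString($html, $config, 'utf8');
    $tidy->cleanRepair();
    return tidy_get_output($tidy);
}

$html = tidy_if_available('<p>Hello, microformats');
// $html can now be handed to the parser's parseSource() method
</code></pre>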
<p>The parser returns <code>false</code> on failure, or an xArray object with all of the microformats it finds otherwise. Therefore, it is good practice to always check the result for <code>false</code> before anything else.</p>
<h5>A note on xArray object</h5>
<p>xArray is modeled after the <a href="http://www.prototypejs.org/">Prototype</a> Enumerable object, in order to facilitate array manipulation. It takes some getting used to, but allows quite clever stuff. I tried my best to keep the source clean and well-commented, and there are some pre-built docs for it in the <code>docs/</code> folder. You can re-run phpDocumentor over the sources to generate the output that suits you best. However, if you don't like the way it works, you can always use its <code>toArray()</code> method to get a good ol' <abbr title="PHP Hypertext Preprocessor">PHP</abbr> array out of it. Here is an example of this:</p>
<pre><code>$filename = "http://microformats.org/about/people/";
$html = file_get_contents($filename);
$mfParser = new MicroFormatParser();
$mf = $mfParser->parseSource($html);
if ($mf) var_export($mf->toArray());
</code></pre>
<h5>A bit more advanced usage example</h5>
<p>Before you call the <code>parseSource()</code> method, you can calibrate the parser to extract just the microformats you're after. You do that by passing a hash of options to the <code>parserSetup</code> method, like this:</p>
<pre><code>$mfParser->parserSetup (array (
'hcard' => true,
'hreview' => true,
'hcalendar' => true,
'reltag' => true,
));
</code></pre>
<p>The parser will fetch <em>all</em> the microformats it finds by default, so the previous code just spells out the default behavior. However, doing something like this will seriously limit your search (and memory usage ;)):</p>
<pre><code>$mfParser->parserSetup (array (
'hcard' => true,
'hreview' => false,
'hcalendar' => false,
'reltag' => false,
));
</code></pre>
<p>Please note that you have to do this <strong>before</strong> you call the <code>parseSource()</code> method. So the full example source would be:</p>
<pre><code>$filename = "http://microformats.org/about/people/";
$html = file_get_contents($filename);
$mfParser = new MicroFormatParser();
$mfParser->parserSetup (array (
'hcard' => true,
'hreview' => false,
'hcalendar' => false,
'reltag' => false,
));
$mf = $mfParser->parseSource($html);
if ($mf) $mf->each('
echo "<h1>".get_class($value)."</h1>";
var_export($value);
echo "<hr />";
');
</code></pre>Vehttp://www.blogger.com/profile/10620632095701240843noreply@blogger.com4