Monday, December 03, 2012

[off-topic] Perl or PCRE: sort strings with numbers

A little regular-expression trick (for engines that support backtracking and backreferences) for comparing two strings that may contain numbers.

The trick is to join the two strings with a NUL character (which never occurs in human-readable strings anyway) and use it as an anchor to find the longest common prefix that is followed by a number in both strings. Then compare the two numbers.

#!/usr/bin/env perl
use strict;
use warnings;

sub cmp_str_with_numbers
{
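        # sort() passes the two strings in the package variables $a and $b.
        my $s = $a."\x00".$b;
        # $1 is the longest common prefix followed by a number in both strings;
        # (?<!\d) keeps it from ending in the middle of a number (19 vs. 100).
        if ($s =~ m/^(.*(?<!\d))(\d+).*?\x00\1(\d+)/) {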
                if ($2 != $3) {
                        return $2 <=> $3;
                }
        }
        return $a cmp $b;
}

my @test1 = (
        'Test 2 ccc',
        'Test 1 aaa 1',
        'Test 1 aaa 10',
        'Test 1 aaa 2',
        'Test 10 bbb',
);

my @out0 = sort @test1;
my @out1 = sort cmp_str_with_numbers @test1;
print "original:\n";
print "\t$_\n" for @test1;
print "normal sort:\n";
print "\t$_\n" for @out0;
print "number-aware sort:\n";
print "\t$_\n" for @out1;

Output:

original:
        Test 2 ccc
        Test 1 aaa 1
        Test 1 aaa 10
        Test 1 aaa 2
        Test 10 bbb
normal sort:
        Test 1 aaa 1
        Test 1 aaa 10
        Test 1 aaa 2
        Test 10 bbb
        Test 2 ccc
number-aware sort:
        Test 1 aaa 1
        Test 1 aaa 2
        Test 1 aaa 10
        Test 2 ccc
        Test 10 bbb

Saturday, December 01, 2012

Regex to match the word under cursor

The VIM-specific regex below matches the word under the cursor. (Pasted unmodified from my vimrc, where it also matches German letters.)

/[a-zA-Z0-9ßÄÜÖäüö]*\%#[a-zA-Z0-9ßÄÜÖäüö]*

Documentation is under ':h /\%#'

Example usage: enclose the word under the cursor in an 'em' tag. This works best when bound to a keyboard shortcut.

:s![a-zA-Z0-9ßÄÜÖäüö]*\%#[a-zA-Z0-9ßÄÜÖäüö]*!<em>\0</em>!

Negative side effect: when 'set hls' is in effect, a seemingly random word gets highlighted, because the pattern remains the last search pattern.
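A minimal sketch of such a shortcut, for illustration only. The '<Leader>e' key is an arbitrary choice of mine, not something from my vimrc. The '<lt>' notation stands for a literal '<' so that the tag is not mistaken for a key name, and the trailing ':nohlsearch' switches the highlighting off until the next search, which hides the side effect when 'hls' is set.

```vim
" Hypothetical mapping: wrap the word under the cursor in <em>...</em>.
" <Leader>e is an assumed, freely chosen key; pick any unused one.
" <lt> is Vim's key notation for a literal '<' inside a mapping.
nnoremap <Leader>e :s![a-zA-Z0-9ßÄÜÖäüö]*\%#[a-zA-Z0-9ßÄÜÖäüö]*!<lt>em>\0<lt>/em>!<CR>:nohlsearch<CR>
```

Note that ':nohlsearch' only has this effect in a mapping or when typed directly; according to the Vim help it does not work from inside a function or an autocommand.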

Search for a misspelled word

Alternative 1:

/The\S\+les\(Themistokles\)\@<!

Alternative 2:

/\(Themistokles\)\@!\(\<The\S\+les\>\)

Both search for any word which starts with 'The' and ends with 'les', but is not 'Themistokles'.
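Once the misspellings can be found, the same pattern can probably be reused in a substitution. The line below is only a sketch based on Alternative 1, with word boundaries added; the 'c' flag asks for confirmation at every match, which seems prudent since '\S\+' also accepts punctuation inside the word.

```vim
" Sketch: replace every 'The...les' word that is not already 'Themistokles'.
" \< and \> limit the match to whole words; 'g' handles several matches per
" line and 'c' asks for confirmation before each replacement.
:%s/\<The\S\+les\>\(Themistokles\)\@<!/Themistokles/gc
```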

[link] Wrap a visual selection in an HTML tag

Wrap a visual selection in an HTML tag.

A pretty useful function. I have only slightly modified it to take the tag as a parameter and to insert the tags on the lines before and after the selection. I also hooked it to keyboard shortcuts.

[ Each '^M' below must be replaced with a real ^M character (typed as ^V^M). ]

" Wrap visual selection in an HTML tag.
vmap <C-q> <Esc>:call VisualHTMLTagWrap('cite')<CR>
vmap <C-T> <Esc>:call VisualHTMLTagWrap('title')<CR>
function! VisualHTMLTagWrap(tag)
 normal `>
 if &selection == 'exclusive'
  exe "normal i^M</".a:tag.">"
 else
  exe "normal a^M</".a:tag.">"
 endif
 normal `<
 exe "normal i<".a:tag.">^M"
 normal `>
 normal j
endfunction

Folding something semi-automatically, on demand

Simple functions to fold blocks in a file that have beginning and ending markers.

Since custom folding functions can degrade VIM's performance, the trick is to switch the folding method back immediately after the folds have been applied. For that to work, I found that I have to call 'redraw' before disabling 'foldmethod=expr'.

The snippet below folds all lines enclosed between '<binary' and '</binary>'.

function! FoldWhateverFunc(mstart,mend,ln)
 let t = getline(a:ln)
 if t =~ a:mstart
  return '>1'
 elseif t =~ a:mend
  return '<1'
 endif
 return '='
endfunction

function! FoldWhatever()
 set foldexpr=FoldWhateverFunc('<binary','</binary>',v:lnum)
 set foldmethod=expr
 redraw
 set foldmethod=manual
endfunction

Hint: one can replace the hardcoded 'binary' tag with a call to the 'input()' function. I prefer a non-interactive approach, though: something I can plug into an ':au'.

Edit 1: BTW, ':h fold-expr' contains several useful one-line examples of folding expressions.
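Both variants might look roughly like the sketch below. It assumes that FoldWhateverFunc() and FoldWhatever() are defined as in the snippet above. The names 'FoldWhateverAsk' and 'FoldWhateverAuto' and the '*.fb2' file pattern are made up for the example, and a tag containing a single quote would break the generated expression.

```vim
" Interactive variant (hypothetical name): ask for the tag to fold.
function! FoldWhateverAsk()
 let tag = input('Tag to fold: ')
 if empty(tag)
  return
 endif
 " Build the same expression as in FoldWhatever(), using the typed tag.
 let &l:foldexpr = "FoldWhateverFunc('<" . tag . "','</" . tag . ">',v:lnum)"
 setlocal foldmethod=expr
 redraw
 setlocal foldmethod=manual
endfunction

" Non-interactive variant: fold right after a file has been read.
" The '*.fb2' pattern is only an assumed example; adjust it to your files.
augroup FoldWhateverAuto
 autocmd!
 autocmd BufReadPost *.fb2 call FoldWhatever()
augroup END
```

The interactive function can be run with ':call FoldWhateverAsk()'. The sketch uses 'setlocal' and '&l:foldexpr' so that only the current window is affected, whereas the original snippet uses plain 'set'.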