
zip2dir (expand a zip to a directory of the same name)

IT

I needed to expand a lot of jars (Java zips) and other zips of various names into directories of the same name for each file. With 6,239 files of which some are jars, some other zips and many xml and other filetypes all not properly identified by a file extension, this gets a bit too much to do manually.

So:
Finding candidates for these is easy with find . -type f.
The file is most probably a zip archive if the first two characters are "PK", good old Phil Katz' signature. A friendly head -c 2 checks that.
All combined with some rudimentary error checking:

#!/bin/bash
# There is little data security here, so know what you're doing.
# All risks in using this code are yours. It moves and deletes files quite stupidly.
# (c) Daniel Lange, 2009, v0.01, released into the public domain
if [ $# -ne 1 ] ; then
        echo "Error: $0 expects exactly one argument, a (fully qualified) path/to/a/zipfile"
        exit 1
fi
# Quote "$1" everywhere so file names containing blanks survive
if [ ! -r "$1" ] ; then
        echo "Error: file does not exist or no read permission on $1"
        exit 2
fi
if [ ! -w "$(dirname "$1")" ] ; then
        echo "Error: cannot write to directory of $1"
        exit 3
fi
# A zip archive starts with the two bytes "PK"
if [ "$(head -c 2 "$1")" == "PK" ] ; then
        # Move the archive out of the way, then unpack it into a directory of the same name
        mv "$1" "$1.tmp"
        mkdir -p "$1"
        unzip -d "$1" "$1.tmp"
        rm "$1.tmp"
else
        echo "$1 is not a zipfile"
fi

Download available here (1KB).

Typical usage:

 find . -type f -print0 |xargs --null -n 1 zip2dir

This will expand all zips under the current directory.
Leave the zip2dir out for a dry run (xargs will just print to the tty then). Look at the -exec switch when digging around a bit more into what find can do for you.
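
The PK check and the find traversal can also be combined into a listing that touches nothing, which is handy before letting zip2dir loose on a large tree. The following is only a minimal sketch and not part of the original script: the function name list_zip_candidates and the scratch files are made up for illustration, and it uses find's -exec switch instead of xargs.

```shell
# Print every regular file below directory $1 that starts with the bytes "PK".
# Nothing is moved, unpacked or deleted.
list_zip_candidates() {
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            if [ "$(head -c 2 "$f")" = "PK" ]; then
                printf "%s\n" "$f"
            fi
        done' sh {} +
}

# Demo on a scratch directory holding one fake zip and one other file.
scratch=$(mktemp -d)
printf 'PK\003\004' > "$scratch/library.jar"
printf '<xml/>' > "$scratch/config"
candidates=$(list_zip_candidates "$scratch")
printf '%s\n' "$candidates"
rm -rf "$scratch"
```

Once the list looks sane, the real run with -exec would be along the lines of find . -type f -exec zip2dir {} \; with zip2dir somewhere in the PATH.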

Google GMail dominating the email market

Other

Google's GMail was launched in April 2004, and only in February 2007 did Google drop its invite system and open up to the general public, according to Wikipedia's history of GMail. That's some five years of operations up to now.

It kind of amazed me how many people I know have GMail as their primary mail provider. So I took the chance today to get a bit of statistics to check my gut feelings:

A friend of mine selected some (mostly American) bloggers who have indicated specific interests in a topic related to his doctoral thesis. This sample ended up being 1,375 people. These folks have 295 different email domains. Only.

A whopping 46% of the (rather random) sample use GMail, 12% Yahoo, 8% Hotmail and about 3% AOL. While Yahoo has some foreign domains in the sample (yahoo.co.uk, yahoo.ca, see mostly American bloggers above), these add up to around 0.1% of the sample, so they are not really significant.

Distribution of American bloggers' email domains
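
For anyone wanting to repeat such a count on their own address list, a tally can be sketched in a few lines of shell. This is an assumption about how one might do it, not the method used for the sample above: the function name domain_counts is hypothetical and the addresses are placeholders.

```shell
# Read one address per line on stdin and print "<count> <domain>",
# most common domain first. Lines without exactly one "@" are skipped.
domain_counts() {
    tr '[:upper:]' '[:lower:]' |
        awk -F'@' 'NF == 2 { print $2 }' |
        sort | uniq -c | sort -k1,1nr -k2,2 |
        awk '{ print $1, $2 }'
}

# Placeholder addresses, not taken from the sample discussed above.
counts=$(printf '%s\n' 'a@example.com' 'B@Example.com' 'c@example.org' | domain_counts)
printf '%s\n' "$counts"
```

Dividing each count by the number of input lines gives the percentage shares.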

This data is in no way representative, but still wow. Google basically has a monopoly on search and now seems to have a close-to-majority footprint in personal email.

I guess the dominance is currently larger in the States than in Europe or Asia as GMail has only gradually learned languages beyond English.
Large local providers should also have some foothold in these markets, similar to the Comcast and SBC customers that are still significant in the sample depicted above. The local providers in Europe and Asia will just be somewhat stronger (for now). Google is also aggressively targeting corporations with hosted email and apps now, so one can expect further and accelerated growth in that area. Quite a number of companies are considering using hosted email instead of the conventional mail system they have operated on site for many years now.

So while Gina Trapani recommends "Break Google's Monopoly on Your Data: Switch to Yahoo Search", may I humbly point out: It's becoming quite impossible to keep your emails just between the sender and the recipient these days.

Even if you personally do not use GMail, Google can (technically) still profile you because a huge chunk of people you communicate with send from GMail and receive and store your emails there.

Nearly all email that is sent also passes spam filters before delivery. Google bought the Postini spam filter in 2007. That anti-spam service is used by many enterprises and even city governments, see here.

So time to consider (unencrypted) email as what it has always been: The digital equivalent of a postcard.
Just now Google has become the postmen. All of them, every second shift. You should hope they're not nosey. Or send letters.

Update:

11.05.2014: Benjamin Mako Hill has written a blog entry Google Has Most of My Email Because It Has All of Yours doing analysis for his own email box. He found a third of his inbox emails come from Google and - as he doesn't usually reply to newsletters and the like - more than half of his own email replies (57% in 2013) end up at GMail. He published his code in case you want to do the analysis on your own email.

Fixing FreeNX / NoMachine NX keyboard glitches (e.g. ALTGr)

Linux

There is an add-on technology to X or VNC called NX, made by an Italian company called NoMachine. It's quite useful as it substantially speeds up working on remote desktops via slow network connections (e.g. DSL pipes).

The libraries that implement NX are released under GPLv2 by that company. A server wrapping up the libraries' functionality is available as closed source from NoMachine or as a free product (GPLv2 again) by Fabian Franz, called FreeNX.

FreeNX itself is amazing as it is written in BASH (with a few helper functions in C). It's also able to mend some of the shortcomings of the NX architecture. E.g. stock NX requires a technical user called "nx" to be able to ssh into the NX server with a public/private keypair. FreeNX can work around that for more secure set-ups.

One issue I bumped into quite regularly with Linux clients and Linux hosts from different distributions/localisations is that the keymaps are not compatible. This usually results in the ALTGr key being unusable, so German keyboard users can't enter a pipe ("|"), tilde ("~") or a backslash ("\") character. Also, the up and down keys usually result in weird characters being pasted to the shell. Now all of that makes using a shell/terminal prompt quite interesting.

Continue reading "Fixing FreeNX / NoMachine NX keyboard glitches (e.g. ALTGr)"

Fix Umlauts in the XFCE Terminal

IT

The XFCE Terminal has the weird issue of sometimes showing question marks (?) instead of German Umlauts (äöüÄÖÜ) although they work fine in any other stock XFCE application (e.g. the default editor "mousepad").

The solution to this can be found on the XFCE Forums but it took me quite some time to find it. It was difficult to find a suitable search query to dig out that page. Google turns up a lot of irrelevant stuff on "XFCE Terminal question marks"...

XFCE Editor Umlauts with and without LANG variable set

The problem with Umlauts (and other non-ASCII 8-bit characters) showing as question marks arises if the user has no LANG variable set.

A simple

export LANG=en_US

resolves the issue. Put that into ~/.bashrc or any other place suitable in your distribution.
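
If a machine may or may not already have a locale configured, the export can be guarded so that it never overrides an existing setting. This is just a sketch for ~/.bashrc; the helper name ensure_lang is made up here and is not something from the forum post.

```shell
# Hypothetical helper: export a fallback locale only if LANG is unset or empty.
# $1 is the fallback, defaulting to the en_US used above.
ensure_lang() {
    if [ -z "${LANG:-}" ]; then
        LANG=${1:-en_US}
        export LANG
    fi
}

ensure_lang en_US
```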

Gentoo users may want to

su  # become root
echo "LANG=en_US" >> /etc/env.d/02locale
env-update
exit
source /etc/profile

to set the LANG variable system-wide.

So keywords, dear Google: Umlaute, deutsch, Fragezeichen, kaputt, falsch, broken, display, zeigt, charset, Zeichensatz :-)

Getting dual-screen (xinerama) to work with Matrox G450/550 graphics cards and Xorg 1.5

Gentoo

Gentoo finally decided to update Xorg to 1.5. Because this has very substantial changes against the previous version, some things break and there is a migration guide that you are nagged to read. After the upgrade I found that the Matrox card in one of my servers would not display xinerama anymore, i.e. I would get the same image on both screens only. This is the default behaviour for the stock Xorg mga driver. It needs a proprietary HALlib to get real dual-screen capabilities. Whilst there are a few unstable ebuilds for x11-drivers/xf86-video-mga none worked for me any better with Xinerama. The Gentoo Changelog is useless as usual. (Gentoo ebuild ChangeLogs tend to never really tell what is fixed, if you're lucky they reference a bug with a good description. But that's only if you're really lucky.)

Worse, that driver hasn't been updated by Matrox anymore since mammals took over the earth (figuratively ... 2005). This is the typical unmaintained-closed-source-drivers-make-hardware-obsolete-sooner-than-later story. Luckily the cards are quite widely used and clever people from the Open Source community have written guides (Tuxx-Home, Fkung) on how to dissect the proprietary driver and combine parts of it with the Open Source version so that it can be linked into recent X servers. Unfortunately because of the architectural changes in Xorg 1.5, following these guides will fail at the compile stage.

In the Matrox Forum of Alexander Griesser, the author of the first comprehensive Matrox driver install guide linked above, people currently mostly downgrade to previous Xorg versions to work around the issue.

But there is a better^Hworking solution already emerging :-P ...

Continue reading "Getting dual-screen (xinerama) to work with Matrox G450/550 graphics cards and Xorg 1.5"

Windows Vista dial-up networking slow to establish connection

IT

If you find that Microsoft Windows Vista is slow to establish a dial-up network connection (DUN) ("register with the network"), that may be caused by it also trying to get an IPv6 address from an IPv4-only ISP. In that case, remove the IPv6 protocol from the Properties -> Network tab of the DUN. Worked for me when dialing into an ISP via Bluetooth / mobile phone. YMMV.

Cool, command-line style blog design

Private

It's very seldom that a blog design catches my eye.

Screenshot of Pete Hindle's blog

Most common templates for blog systems like Wordpress or Serendipity are very well honed. Usability, accessibility and visual design of these systems and their default templates are as good as it gets for the time being. Trying to do better usually fails. But Wordpress-CLI, which I found at Pete Hindle's blog, manages to create a unique design. It may be inspired by the google shell (gosh) or older incarnations of the concept, e.g. WebCmd, but it is unique because it requires poking and trying stuff to expose the full functionality. A bit like an old school rogue-like game, it inspires playing with it to find out more. And it is a nice reminder of the command interfaces to BBS systems, although the authors chose a syntax that resembles a unix shell. You can try out different sub-designs at Rob McFarland's site. Obviously, usability still sucks, but it's worth it! Well done. And Pete: Please write some interesting entries in that blog now :-).

Wikimedia 2008/2009 Fundraiser Analysis I

Other

The Wikimedia Foundation has started its annual fundraiser again on November 4th 2008. It is scheduled to run until January 15th 2009. I've written several articles on last year's so I've been asked a lot to comment on the current one. This year Wikimedia clearly state they want to raise $6 million. No more diffuse targets, "number of donors" weirdness as last year. Well done, Rand!

Wikipedia main page shows $3.289.684
The Wikimedia fundraiser contributions page shows $2.289.684 - the Sloan Foundation million less

Rand Montoya is Wikimedia's new "Head of Community Giving" and responsible for making this fundraiser much more professional than last year's. The contribution history page still is only good to spot the occasional clown donating JPY 1 (which makes Wikimedia lose money due to transaction fees). But Rand pointed me to two other pages giving a better report on current donations:

The meter banner on top of many Wikipedia pages matches the data on these pages quite closely. It's just a million off :-). (see images above) That million is an annual donation by the Sloan Foundation ($3 million over three years). All of it has been accounted for in the donation year 2007 (see the Wikimedia Financials FAQ) but somehow it is added again in the meter but not in the statistics pages. BTW: The German Wikimedia chapter has a less fancy, but more complete and transparent reporting of donations. Not cut off after seven days. All of a month on one page. Data ready for copy & paste into a spreadsheet. Benchmark.

So, is the $6 million target realistic? At day 29 of the 74 day campaign, $3.3m have been raised. Taking out the major donations totaling $1.7m (these are not stochastic enough to be estimated with any reasonable validity) and assuming that the Christmas tax rally at the end of the year roughly equals the start of fundraising spike, we can expect $2.6m to be raised from individual contributors. Adding back the major donations, the estimation for the total fundraiser comes in at $4.3m. That's plus any major donations still to be announced. I bet a guy with Rand's experience has another ace up his sleeve. One less elegant solution has been hinted at in the fundraiser FAQ already: "What happens if you do not reach your goal? [...] A second, smaller fundraiser may be scheduled for March."
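
The sums behind that estimate can be re-traced quickly. The snippet below only re-adds the figures quoted above (in millions of US dollars); it does not reproduce the spike assumption that leads to the $2.6m figure, which is taken as given. The variable names are made up for illustration.

```shell
# Figures in millions of US dollars, as quoted in the text.
raised_so_far=3.3
major_donations=1.7
individual_estimate=2.6

summary=$(awk -v r="$raised_so_far" -v m="$major_donations" -v e="$individual_estimate" 'BEGIN {
    so_far = r - m    # raised from individual contributors until day 29
    printf "individuals so far: %.1fm, still expected: %.1fm, projected total: %.1fm", so_far, e - so_far, e + m
}')
printf '%s\n' "$summary"
```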

Converting a DVD film (mpeg2) to DV

Other

There are a gazillion web pages telling you how to convert DV to MPEG2 for DVD use. But I got a DVD from a corporate event and needed to convert it to DV to be cut in kdenlive. So just the other way around. Try to find a web page about that direction (needle in haystack, anyone?).

Giving up on Google, I tried unsuccessfully with the swiss army knife that comes to mind first (ffmpeg).

While something like ffmpeg -i vts_01_1.vob -i vts_01_2.vob -i vts_01_3.vob -sameq -target dv ../Raum_Video.avi creates a nice .avi, even mplayer complains about it violating the dv and avi standards.

So back to digging around in tutorials and forums and trial and error with other tools. Finally I found Avidemux to be the tool of choice. It encapsulates ffmpeg and other tools nicely to make them produce the expected results. Set video to DV (lavc), Audio to WAV PCM and the container format to AVI and go grab a coffee (or rather a meal). It creates a nice DV file that you can easily work with in your favorite video editor.

Screenshot of Avidemux in action

httpdate - set local date and time from a web server

Linux

While ntp may be a great protocol, I find it quite bloated and slow for the simple purpose of just setting a local date and time to a reference clock. I do not need 20ms accuracy on a notebook's clock :-). Thus I use(d) rdate for a decade now but the public rdate servers are slowly dying out. So I'm replacing it more and more with htpdate which works quite nicely. It's written in C and a Perl alternative is available on the author's site. There is also a forked Windows version of it available.

While developing a somewhat larger bash script (which syncs a few servers), I wondered whether I could implement the time sync part in bash as well.

It's quite possible:

# open a tcp connection to www.google.com
exec 3<>/dev/tcp/www.google.com/80
# say hello HTTP-style (HTTP lines end in CRLF)
printf 'GET / HTTP/1.0\r\n\r\n' >&3
# parse for a Date: line, strip the header name and the trailing CR,
# and with a bit of magic throw the date-string at the date command
LC_ALL=C LANG=en date --rfc-2822 --utc -s "$(head <&3 | grep -i '^Date: ' | sed -e 's/^Date: //I' | tr -d '\r')"
# close the tcp connection
exec 3<&-

Simple, eh?
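
Setting the clock needs root, so it can help to test the parsing step on its own first. Below is a minimal sketch of just that step, run against a canned response. The function name extract_http_date and the sample response are made up for illustration; a real response would come from the socket on file descriptor 3 as above.

```shell
# Read an HTTP response on stdin and print the value of its first Date: header.
# Carriage returns are dropped and parsing stops at the blank line ending the headers.
extract_http_date() {
    tr -d '\r' | sed -n -e '/^$/q' -e 's/^[Dd][Aa][Tt][Ee]: *//p' | head -n 1
}

# Made-up response for illustration; the last line sits in the body and must be ignored.
stamp=$(printf 'HTTP/1.0 200 OK\r\nDate: Tue, 02 Dec 2008 10:00:00 GMT\r\nServer: demo\r\n\r\nDate: not a header\n' | extract_http_date)
printf '%s\n' "$stamp"
```

With GNU date the result can be checked without touching the clock via date --utc -d "$stamp" before handing it to date -s.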

Continue reading "httpdate - set local date and time from a web server"

Disabling a group policy'd screensaver on Windows

IT

I guess many people know the issue of having a screen saver forced active after some time through a group policy in a corporate environment. This is usually done to make sure systems are locked during breaks if people forget to press Win+L (or Ctrl+Alt+Del and then Enter). While that may well help IT security, it turns problematic when giving presentations for extended periods of time. Having to move the mouse through the presentation pointer every few minutes or dash back to the PC once the screen saver has kicked in, again, is simply annoying. On your company's systems you may be able to get the system admins to allow configuration of the interval or allow for disabling the screen saver, but on foreign systems you're often lost. But...

Continue reading "Disabling a group policy'd screensaver on Windows"

Freenode staff list

IRC

Donna "Sportchick" Crawford has put up a blog entry on the freenode staff blog listing the currently active 39 freenode staff members. Freenode is growing gradually towards 50.000 users, so we have quite a lot to do :-).

People readily available to help on very short notice are voiced (+v) in #freenode. Prefer to contact these whenever possible.

If none are voiced, just ask away in #freenode anyways. There are usually some staff reading and many questions can be answered by the channel regulars as well.

/who freenode/staff/*

will give you a list of the staff currently online. People that have marked themselves away have a "G" (gone) in their who-line, folks that are there have an "H" (here).

You can check when somebody has talked the last time by using whois with the nick appended twice, like

/whois JonathanD JonathanD

(yes, twice the nick!) and thus see who might be able to help for really private matters and who did just idle too long to be really near the keyboard.

kloeri announces Exherbo, another source based Linux distribution

Linux

Bryan Østergaard (aka kloeri) announced Exherbo today. He assembled a team of (ex-)Gentoo developers including Ciaran McCreesh (ciaranm), Richard Brown (rbrown), Fernando J. Pereda (ferdy) and Alexander Færøy (eroyf) to build a new source based Linux distribution.

They would like to overcome some of the shortcomings of Gentoo both from a technical as well as from a community perspective. Obviously this is easily said and hard to really achieve, so time will tell how successful that team can be. Renaming USE-Flags to OPTIONS and merging the platform KEYWORDS (like x86, ~x86) into the Options-logic is no big deal, but getting the thousands of ebuilds^Hpackages better supported and maintained than Gentoo will be the real deal{maker|breaker}.

Paludis, ciaranm's package manager, supports Gentoo ebuilds and can import them into Exherbo, so there is a potential migration path sketched out.*

They also add another init-system re-write ("Genesis") to the pool. An already quite crowded pool with rather shallow water, I may add.

Exherbo has nothing that is end-user-safe at the time of the announcement, so it's safe to assume kloeri's team wants to attract further development capacity :-).

Browse around the website or join folks in #exherbo if you're interested.

I asked in #exherbo what "exherbo" means ... Latin for "uproot" was the answer. How fitting.

Updates

*19.04.08: Two friendly folks wrote in to clarify that Paludis currently can only import Ebuild-builds into Exherbo via importare, i.e. take a Gentoo build result and package it for importing into the Exherbo system through Paludis.
23.05.08: Ciaranm wrote a blog entry on how to get build results into Exherbo/Paludis via importare.

Serendipity default event_s9ymarkup plugin breaking URLs that contain underscores

Serendipity

The default Serendipity mark-up plugin (event_s9ymarkup) currently breaks URLs that contain underscores.

So

http://en.wikipedia.org/wiki/Statler_%26_Waldorf

will end up

http://en.wikipedia.org/wiki/Statler</u>%26_Waldorf

because of a faulty regex. Garvin Hicking does not really want to fix this. (See this s9y support forum article for arguments pro/contra fixing it). So if you encounter this problem, your options are:

  • replace _ in URLs with %5F (aka manually urlencode it)
  • remove the plugin or disable it
  • patch the plugin

Patching is basically changing

plugins/serendipity_event_s9ymarkup/serendipity_event_s9ymarkup.php:

$text = preg_replace('/\b_([\S ]+?)_\b/','<u>\1</u>',$text);

to

$text = preg_replace('/\ _([\S ]+?)_\ /',' <u>\1</u> ',$text);

If you want to be writing things like "Haha[lol]" (which I have no real use for ...), extend the "\ " with whatever you'd like to be o.k. to delimit bolded words beyond blanks. It should only be symbols that are not valid in URLs (so none of "$-_.+!*'()," which are all valid in URLs according to RFC 1738).

You may also want to consider replacing one underscore ("_") with two or more ("__") to make the detection, that you actually wanted to write bold text, more reliable.

Updated Greasemonkey script for Xing

Private

Xing has just updated its image-thumbnails naming "algorithm" once again:

You'll now find thumbnails named /2f1127fe2.3144098_s2,2.jpg, so another component like <comma><digit> has been added to the name. Thus the Greasemonkey script linked from my article Greasemonkey to enlarge Xing pictures needs to have its main regex amended:

Change \_s(1|2|3)?\. to read \_s(1|2|3)?(,\d)?\. in three places in the script.

Or download an updated version here. I hope "louis" will update the version hosted at userscripts.org, too.
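
The effect of the amendment can be checked outside the browser. The sketch below is an approximation, not the userscript itself: it rewrites both patterns for grep -E, which has no \d, so [0-9] is used instead and the backslash before the underscore is dropped. It uses [0-9]+ rather than a single digit, so longer numbers after the comma are accepted as well. The helper name matches is made up.

```shell
# Original and amended patterns, rewritten as POSIX extended regexes.
old_re='_s(1|2|3)?\.'
new_re='_s(1|2|3)?(,[0-9]+)?\.'

# Succeeds if thumbnail name $2 matches extended regex $1.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Report for each thumbnail name whether the old and the new pattern match.
for name in '/2f1127fe2.3144098_s2,2.jpg' '/7553bd445.4550412_s2,10.jpg'; do
    if matches "$old_re" "$name"; then old=yes; else old=no; fi
    if matches "$new_re" "$name"; then new=yes; else new=no; fi
    printf '%s old=%s new=%s\n' "$name" "$old" "$new"
done
```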

Updates

02.05.2008 As Xing adds multi-digit numbers to the thumbnails now (like /7553bd445.4550412_s2,10.jpg), you need to replace \d with \d+ in the above regex. The linked Greasemonkey script is updated.

16.11.2011 And again the URLs changed. This time appending a pattern like ,4.57x75.jpg. The linked script again was updated. If Scriptish and the auto-updating in Greasemonkey mature, I'll add auto-updating to the script file via the @updateURL parameter.

13.05.2012 The Xing layout is still changing as they Ajaxify the site more and more. Currently the Xing XE is probably the best working enhancement to see Xing pictures in a recognisable size.