Sunday, February 23, 2014

VMware ESXi: the unexpected effects of swapping boot keys

It all happened because I boot ESXi from a USB flash drive.
Sometimes things drift toward entropy in my lab and a host gets booted from the wrong stick. When that happens, ESXi treats the boot as a new installation (you notice it because it asks for the license key again).

The standard behavior of ESXi is to create a persistent scratch space whenever the host is booted for the first time.
If the host uses previously formatted volumes (as is the case for me) it will lock the first disk device it finds because it uses it for the scratch partition.

The result is that the entire device is locked (no more VMFS partitions can be created on it) and the existing datastore cannot be deleted. Trying to delete it returns the following error:

"Cannot remove datastore ‘Datastore Name: * VMFS uuid: *’ because file system is busy. Correct the problem and retry the operation."

In this case, move the scratch space somewhere else. Fortunately that is quite easy to do, although it requires rebooting the ESXi host.

Where is the scratch space?
On the ESXi console (the console and the SSH service must be activated first), check the content of:

/etc/vmware/locker.conf

The file should contain something similar to:

/vmfs/volumes/510eb9e2-3772226f-db9b-001aa032346a/.locker 0

Your UUID will obviously be different. Check whether it corresponds to the datastore you want to unmount:

esxcli storage filesystem list

or use:

df

Filesystem        Bytes       Used    Available Use% Mounted on
VMFS-5     343328948224 1026555904 342302392320   0% /vmfs/volumes/DS_1
vfat          261853184  136474624    125378560  52% /vmfs/volumes/059b62ef-4b7ea870-d01b-603d7ae96396
vfat          261853184       8192    261844992   0% /vmfs/volumes/6e4e0fef-ed0fc28a-15d1-d8dcc0b12819
vfat          299712512  211755008     87957504  71% /vmfs/volumes/51179227-ea8b1764-c1c1-001aa0322571

Either command lists all mounted volumes and their UUIDs.
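If you prefer to script the comparison, here is a minimal Python sketch. It only parses text: the sample strings are copied from the examples above, and the function names scratch_volume and mounted_volumes are made up for illustration, not part of any VMware tool.

```python
import re

# Sample inputs copied from the examples in this post. On a real host they
# would come from /etc/vmware/locker.conf and from the output of `df`.
LOCKER_CONF = "/vmfs/volumes/510eb9e2-3772226f-db9b-001aa032346a/.locker 0"

DF_OUTPUT = """\
Filesystem        Bytes       Used    Available Use% Mounted on
VMFS-5     343328948224 1026555904 342302392320   0% /vmfs/volumes/DS_1
vfat          261853184  136474624    125378560  52% /vmfs/volumes/059b62ef-4b7ea870-d01b-603d7ae96396
vfat          261853184       8192    261844992   0% /vmfs/volumes/6e4e0fef-ed0fc28a-15d1-d8dcc0b12819
vfat          299712512  211755008     87957504  71% /vmfs/volumes/51179227-ea8b1764-c1c1-001aa0322571
"""


def scratch_volume(locker_conf: str) -> str:
    """Return the volume name or UUID that hosts the scratch directory."""
    path = locker_conf.split()[0]
    match = re.match(r"/vmfs/volumes/([^/]+)", path)
    if match is None:
        raise ValueError(f"unexpected scratch path: {path!r}")
    return match.group(1)


def mounted_volumes(df_output: str) -> dict[str, str]:
    """Map the last component of each mount point to its filesystem type."""
    volumes = {}
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # not a data row
        fs_type, mount_point = fields[0], fields[-1]
        volumes[mount_point.rsplit("/", 1)[-1]] = fs_type
    return volumes


scratch = scratch_volume(LOCKER_CONF)
volumes = mounted_volumes(DF_OUTPUT)
print("scratch lives on:", scratch)
print("listed by df under that name:", scratch in volumes)
```

With the sample data the scratch UUID is not listed by df, because df shows the VMFS datastore under its label (DS_1) rather than its UUID. In that case the output of esxcli storage filesystem list is the better source, since it should show both the label and the UUID of each volume.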

Then create a directory for an alternative persistent scratch space (I create it on the USB key itself; with a 4GB key there is enough space for most situations). In our example it goes on the partition with the UUID: 6e4e0fef-ed0… Different systems, different UUIDs.

mkdir /vmfs/volumes/6e4e0fef-ed0fc28a-15d1-d8dcc0b12819/.locker-Hostname

N.B.: Scratch partitions can reside on FAT, VMFS and NFS file systems. The new scratch directory can be created in the vSphere client as well, just use the Storage browser provided.

Now, either use the vSphere client:
  • Choose the ESXi host
  • Configuration tab -> Software -> Advanced Settings
  • ScratchConfig -> set ScratchConfig.ConfiguredScratchLocation to the full path of the directory you just created
  • Reboot the ESXi host

Or use the ESXi CLI this way:

vim-cmd hostsvc/advopt/update ScratchConfig.ConfiguredScratchLocation string /vmfs/volumes/6e4e0fef-ed0fc28a-15d1-d8dcc0b12819/.locker-Hostname

and reboot the ESXi host.

Done! The old datastore can now be deleted and the disk is unlocked.

Side note: what is the scratch space?
It is a persistent location for storing temporary data, including logs, diagnostic information, and system swap. It is not a required feature: ESXi can keep this data on a ramdisk while it is running. It is, however, a best practice not to consume memory for such tasks and to be able to recover logs across reboots.

Sunday, October 14, 2012

About caves and NP-Complete algorithms

Gouffre Berger (France)
I believe myself to be hardened against tech interviews; however, once in a while, I discover that I am mistaken.

Some time ago, during a job interview, I was asked to solve a classic puzzle (later, at home, I discovered that it is known as the “Four men on a rickety bridge” puzzle).

Here is the interview version:
Four speleologists are left with one torch in a cave; the exit is a long, dark corridor, so tight that only two people at once can walk through it.
Not all speleologists can walk at the same speed; it takes them respectively 1, 2, 4 and 7 minutes to cover the distance to freedom.
The torch must be walked back and forth by one of the people since the corridor is really dangerous. It’s cold, the speleologists are tired and the torch’s battery is running out.
You are urged to find the fastest way to get everybody to safety.

As many do, I initially thought to pivot on the fastest person and let him guide all his friends out by carrying the torch.
The total time would then be:
  • 7 and 1 get out, time: 7
  • 1 goes back, time: 1
  • 4 and 1 get out, time: 4
  • 1 goes back, time: 1
  • 2 and 1 get out, time: 2
  • Total time: 7 + 1 + 4 + 1 + 2 = 15 Right?
Wrong!

The trick is to have the two slowest people get out together, so that their long walks overlap:
  • 2 and 1 get out, time: 2
  • 2 goes back, time: 2
  • 7 and 4 get out, time: 7
  • 1 goes back, time: 1
  • 2 and 1 get out, time: 2
  • Total time: 2 + 2 + 7 + 1 + 2 = 14
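The two plans above are easy to double-check by replaying them. Below is a small Python sketch that does so; the function name schedule_time and the way moves are encoded (tuples of walking times, alternating out and back) are my own choices for illustration, and it assumes every speleologist has a distinct walking time so that the time can double as a name.

```python
def schedule_time(times, moves):
    """Replay a crossing plan and return its total duration.

    times: walking times of the people, all distinct (used as identifiers).
    moves: groups of walkers; even-indexed groups walk out, odd-indexed
           groups walk back in with the torch.
    """
    inside, outside = set(times), set()
    total = 0
    for index, group in enumerate(moves):
        walkers = set(group)
        src, dst = (inside, outside) if index % 2 == 0 else (outside, inside)
        if not 1 <= len(walkers) <= 2:
            raise ValueError(f"move {index}: only 1 or 2 people fit")
        if not walkers <= src:
            raise ValueError(f"move {index}: {group} is on the wrong side")
        src -= walkers          # in-place: updates the real set
        dst |= walkers
        total += max(walkers)   # a pair walks at the slower person's pace
    if inside:
        raise ValueError(f"still in the cave: {sorted(inside)}")
    return total


escort_plan = [(7, 1), (1,), (4, 1), (1,), (2, 1)]
pairing_plan = [(2, 1), (2,), (7, 4), (1,), (2, 1)]

print(schedule_time([1, 2, 4, 7], escort_plan))   # 15
print(schedule_time([1, 2, 4, 7], pairing_plan))  # 14
```

The replay also rejects impossible plans, for example one where somebody walks back without having left the cave first.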

Non-geek readers can stop here; from now on we talk about NP-completeness and heuristics.

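As a side note, this particular set can be solved with a greedy algorithm: in each round, move the two slowest people out using the cheaper of the two strategies above, then repeat on the remaining group.
With at most 2 people walking in the tunnel at once, given n speleologists {1..n} with walking times t(1) < t(2) < … < t(n), each such round costs:

Time per round: t(1) + t(n) + Min(2*t(2), t(1) + t(n-1))

For the example, the first round costs 1 + 7 + Min(4, 5) = 12, and the last pair walks out in 2, which gives 14.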

However, such an approach might not be optimal in the generalized case: you need dynamic programming and backtracking to guarantee a solution that works with any number of people and different group sizes.
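To make the comparison concrete, here is a hedged Python sketch of both approaches. greedy_time applies the per-round formula above; exact_time is a plain shortest-path search (Dijkstra over "who is still inside, where is the torch" states), which I use here as a simple stand-in for the dynamic programming mentioned above. Both names, the capacity parameter, and the assumption that a single person walks the torch back are mine.

```python
import heapq
from itertools import combinations


def greedy_time(times):
    """Greedy for corridors that fit 2 people: clear the two slowest per round."""
    t = sorted(times)
    total = 0
    while len(t) > 3:
        total += t[0] + t[-1] + min(2 * t[1], t[0] + t[-2])
        t = t[:-2]
    if len(t) == 3:
        total += t[0] + t[1] + t[2]   # escort, fastest returns, last pair
    elif t:
        total += t[-1]                # one walker, or one pair together
    return total


def exact_time(times, capacity=2):
    """Minimum total time found by exhaustive shortest-path search.

    Assumes exactly one person carries the torch back on each return trip.
    The state space grows as 2**n, so this is only usable for small groups.
    """
    everyone = frozenset(range(len(times)))
    start = (everyone, True)          # (people still inside, torch inside?)
    best = {start: 0}
    queue = [(0, 0, start)]           # (elapsed, tie-breaker, state)
    counter = 1
    while queue:
        elapsed, _, (inside, torch_inside) = heapq.heappop(queue)
        if not inside:
            return elapsed
        if elapsed > best[(inside, torch_inside)]:
            continue                  # stale queue entry
        if torch_inside:
            groups = [g for k in range(1, capacity + 1)
                      for g in combinations(sorted(inside), k)]
        else:
            groups = [(i,) for i in sorted(everyone - inside)]
        for group in groups:
            moved = frozenset(group)
            new_inside = inside - moved if torch_inside else inside | moved
            state = (new_inside, not torch_inside)
            cost = elapsed + max(times[i] for i in group)
            if cost < best.get(state, float("inf")):
                best[state] = cost
                heapq.heappush(queue, (cost, counter, state))
                counter += 1
    return 0                          # only reached when nobody is inside


print(greedy_time([1, 2, 4, 7]))  # 14
print(exact_time([1, 2, 4, 7]))   # 14
```

Calling exact_time with a different capacity explores wider corridors, which is where the two-person formula no longer applies as written.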

Thursday, October 11, 2012

Are LinkedIn Skills Endorsements diluting the value of our profiles?

Unexpected Skills
I am in two minds about the LinkedIn skills endorsement.

My anecdotal experience is that it ignites the “Click-me-I-click-you” mechanism that is common on Twitter or Facebook: people pay back the favour by endorsing those who endorsed them.
The risk is that we end up being labelled with our most public skills; for instance, I seem to be a Cloud expert (notably because I have been vocal about this since 2007) while the most substantial part of my job is all about IT Transformation... OK, there's a part of Cloud Computing in it, but it's not that relevant.

In a certain sense, Skills Endorsements could be diminishing the value of our profiles by narrowing them down to our most discussed skills. They fail to create a holistic view of our experience; a short, non-scientific profile search on my contacts shows that fewer than 10% of these people have been exposed to some of my very own strategic skills like IoT.

If this really takes off, either it’s time to trim the fat off LinkedIn contacts, or the Skills Endorsement should be targeted at specific communities.