Archive for May, 2012
Hyper-V 3 could be released somewhere around Q3, and some VMware enthusiasts have already started making their own predictions about the future release of vSphere. You can read one of the more interesting posts on this prediction here at up2v, and if it turns out to be true, I think we should thank Microsoft for playing their part in making vSphere a better product. As we all know (and it's undeniable), Fault Tolerance for VMs with multiple vCPUs is one of the features we have been waiting for for so long, and if Hyper-V 3 can push VMware to make it possible, I'm more than happy with it.
P/S: I'm also hoping that the new vCenter will be able to manage Hyper-V hosts.
With another new project coming up, I did some testing this morning to get familiar with the port binding configuration for iSCSI in ESXi 5.0. Amazingly, you can now do this through the GUI instead of the esxcli commands we had to use on ESX 4.x (the CLI equivalent is sketched after the lab list below). Unfortunately, I don't have enough hardware to show the Jumbo Frames configuration in this tutorial. Nevertheless, I hope you will enjoy the video.
- 1x vCenter + ESXi 5.0 VM (nested)
- 1x iSCSI server (Win2k3 64-bit + StarWind)
- 2x vSwitch (Mgmt + iSCSI)
- 2x vmk for iSCSI in the same vSwitch and IP range
- vmk1 = vmnic1/active and vmnic2/unused
- vmk2 = vmnic2/active and vmnic1/unused
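For anyone who prefers the command line, the same binding can still be done with esxcli on ESXi 5.0. A minimal sketch, assuming the software iSCSI adapter is vmhba33 (a placeholder; check yours with the first command):

# Identify the software iSCSI adapter name (assumed to be vmhba33 below)
esxcli iscsi adapter list

# Bind both iSCSI vmkernel ports to the adapter
esxcli iscsi networkportal add --adapter vmhba33 --nic vmk1
esxcli iscsi networkportal add --adapter vmhba33 --nic vmk2

# Verify the bindings, then rescan so the new paths show up
esxcli iscsi networkportal list --adapter vmhba33
esxcli storage core adapter rescan --adapter vmhba33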
In this SRM 5 + vReplication Part 2 tutorial, you will learn how to configure pairing between two different sites. The most challenging part is that during pairing you may receive a "TCP connection port 80" error, which can cause the site pairing to fail. But as long as you have correct FQDNs between the sites, I 100% guarantee that you will never see this error.
Some of you may have a disconnection issue with your SRM, with the error message "command takes too long to respond". To solve this, make sure the Windows service called "Workstation" is up and running.
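Two quick sanity checks from a command prompt on each SRM server cover both issues above (a minimal sketch; srm-siteb.lab.local is a placeholder FQDN for the remote site):

rem 1. Confirm the remote site's FQDN resolves correctly
nslookup srm-siteb.lab.local

rem 2. Confirm the Workstation service (internal name LanmanWorkstation)
rem    is running, and start it if it is not
sc query LanmanWorkstation
net start LanmanWorkstation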
In this tutorial I will show you the basic installation and configuration of SRM 5 + vReplication. This test lab was actually designed for my next project assignment, and since this is going to be my first project with vReplication, apologies if some of the video content is a bit confusing. Trust me, the installation is pretty straightforward, and the minor issues you may face during installation can be avoided too. The most important thing you have to concentrate on is the design. A meticulous design can prevent you from missing any important information required during the implementation stage, and that's why during this lab I recorded all the parameters required for the configuration.
Requirements for this Part 1 (SRM installation) tutorial are as follows (both sites):
- SRM VM (Win2k3 64-bit, 2x vCPU, 2GB memory)
- SRM database (in this case, I chose a remote SQL Server)
- 32-bit System DSN for SRM (see the note after this list)
- vCenter 5.0 U1 (same VM as SRM)
- FQDN for vCenter
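A note on that 32-bit DSN: on 64-bit Windows, the ODBC administrator you reach from Administrative Tools creates 64-bit DSNs, which SRM cannot use. Launch the 32-bit one instead (assuming a default Windows install path):

rem The 32-bit ODBC Data Source Administrator on 64-bit Windows
C:\Windows\SysWOW64\odbcad32.exe

Create the System DSN for the SRM database there, and the SRM installer will be able to see it.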
SRM 5 + vReplication Part 1
Last week we had to spend more than three hours troubleshooting a "purple screen" issue during an ESXi 5.0 U1 installation on an IBM x3650 M2. The installer booted successfully, but once it reached the stage where you press [Enter] to proceed with the installation, the display suddenly turned purple without any warning whatsoever. This was not a known PSOD, which normally happens due to some hardware-related issue. It was just something different. Sigh!
Eventually, I found a related post in the VMware community where one of the forum members suggested changing the on-board display setting in the server's BIOS. Why? Because the display mode for ESXi 5.0 has been changed to 80x40. Without delay, we arranged another round of testing on one of the servers with our customer. You know what? According to my colleague (Fahmi), there wasn't any on-board display setting in that BIOS, but he did notice that a legacy display option had been enabled instead. Once he disabled it and booted the installer again, the installation finally proceeded successfully.
Like I mentioned in my previous post, "as humans, we tend to blame others first". I don't know whether this is a coincidence that just happens to me, or whether this is what storage guys normally do when their storage is having a problem: "everything is okay on the storage, check with VMware". I don't blame them for being defensive, since they need to cover their asses too. But please listen to other people's opinions first, and don't make a fool of yourself in front of the customer later. I may look stupid in front of you, but once I manage to find the root cause, which I mostly do, you will die, brother!
I remember back in 2009 when there was a big case (an HP EVA storage array went down completely) with one of my enterprise customers. HP simply claimed that their storage was okay, with no errors whatsoever on the EVA, even though the ESX hosts had lost connection with the datastores one by one. Of course, the problem was actually not with the EVA but with the SAN switch (the zoning configuration was lost due to a firmware-related issue). What they forgot was that the SAN switch had also been configured by them, and after tons of denial, they had to admit that it was not a VMware issue.
My second case was a month ago, when a datastore presented by Falconstor's NSS disappeared unexpectedly. It could be that the datastore was formatted or accidentally deleted. As expected, the storage guy claimed nothing was wrong with their storage, even though from the VMkernel logs I found that someone had done a snapshot restore on the NSS the previous night. You wouldn't do a restore if nothing was wrong with your storage, would you? That was the longest and most comprehensive incident report I have ever prepared, all to prove that the problem had nothing to do with the ESX host.
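If you ever need this kind of evidence yourself, the VMkernel log is a plain file you can search directly from the ESXi shell (the naa.* ID below is just a placeholder for the affected LUN):

# ESXi 5.x logs to /var/log/vmkernel.log; ESX 4.x used /var/log/vmkernel
grep -i naa.600xxxx /var/log/vmkernel.log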
My last case happened just last night. As usual, the IBM guy claimed everything was normal, which I agreed with. He asked me to reboot all the hosts, checked here and there, and tried to blame VMware instead. The only error I found in IBM Storage Manager was that the LUN was currently owned by a non-preferred path. This meant that a failover had occurred on the storage controller, which is supposed to be expected behavior when one of the paths fails for some reason. But what IBM failed to realize was that the redundant path on the second controller, which they themselves had configured, wasn't working and had never been tested, and this was confirmed by my customer too.
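For the record, you can see this kind of failover from the ESX side too, without the vendor's storage manager. A rough sketch for ESXi 5.0 (the naa.* device ID is again a placeholder):

# Show the multipathing state for the device; a dead path, or the
# active path sitting on the non-preferred controller, points at a
# failover on the array side
esxcli storage nmp device list -d naa.60050768xxxxxxxx
esxcli storage core path list -d naa.60050768xxxxxxxx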
So, the above cases really taught me how troublesome it can be to deal with the storage guys, and believe me, cooperating with others while in difficulty is not an option for them (typical Asian culture). Not all of them are like this, but it's a fact that everyone will have to face.
It's an easy task, but believe me, more than a few admins have accidentally deleted their VMDKs for a simple reason. There was a case today where my customer accidentally deleted a production VMDK while doing some clean-up in the inventory. His intention was to delete a ghost VM, without knowing that one of the VMDKs in this ghost VM was currently attached to and actively used by one of the production VMs. The result was pretty obvious: within a second the production VM became greyed out and inaccessible. Luckily, the flat file (-flat.vmdk) of a powered-on VM normally will not be deleted, but the descriptor file (.vmdk) will, so to recover the disk, what we have to do is rebuild the VMDK descriptor file for the deleted disk (a shell sketch follows the steps below):
- Create a new virtual disk (e.g. test.vmdk and test-flat.vmdk) with the same size as the deleted one (e.g. braingain.vmdk),
- Delete the new virtual disk's flat file (test-flat.vmdk),
- Rename and edit the descriptor file (test.vmdk to braingain.vmdk),
- Create a new VM and select the existing disk (braingain.vmdk),
- Power on the VM.
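Here is what those steps look like from the ESXi shell; a minimal sketch assuming the original disk was a 20GB thin-provisioned disk on datastore1 (all names and sizes are placeholders, so match them to your surviving flat file):

cd /vmfs/volumes/datastore1/braingain

# 1. Create a dummy disk of the SAME size and provisioning type as the
#    lost one; this generates a fresh descriptor (test.vmdk)
vmkfstools -c 20G -d thin test.vmdk

# 2. We only need the descriptor, so drop the dummy flat file
rm test-flat.vmdk

# 3. Rename the descriptor to match the orphaned flat file...
mv test.vmdk braingain.vmdk

# 4. ...then edit it (e.g. with vi) so the extent line points at the
#    surviving flat file:
#    RW 41943040 VMFS "braingain-flat.vmdk"
vi braingain.vmdk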