Sunday, February 3, 2019

Using command line tools proactively


I still remember the day I first saw books like "The Linux Command Manual". Most of them are just (or close to) a collection of the output of the "man" command. I wondered how a person could memorize so many commands and their usage. There are hundreds of commands, and thousands of corresponding options and usages.

Many people, myself included, figure out which command to use and how to use it by googling a specific question. For example, we google "How do I list files in a directory with Ubuntu Linux" and follow the first few search results, picked more or less at random. As time went by, I learned more and more commands until I could handle most of the issues I ran into.

Then the learning curve reached a plateau. I went longer and longer without learning a new command or a new option of a command I already knew.

"How do the people on Stack Overflow know so many different ways or commands to handle a similar problem?" I seldom walk through the whole manual of a command (most of the time I can't), and I suppose many people don't either. For example, the "dd" command has a lot of fancy and useful options and features, but the defaults work like a charm for more than 80% of real-life problems. If I don't know that an option or an extra feature exists, how would I know that I could use it?

Besides googling at random and waiting for someone's answer on Stack Overflow, there is another way, one that thinks outside the box: think proactively.

To think proactively here means keeping the following questions in mind:


  • What problem am I going to solve?
  • What is the essential property of this problem? Do other problems share this property?
  • What feature should a tool have in order to solve this problem at its root?
  • What kind of tool might solve this problem? How would that kind of tool solve it?
  • What is the result after I apply the tool to the problem?

Let's take the "dd" command as an example again. Assume our problem is "to clone one disk". Then the essential properties of this problem could be:

  • How to clone this disk faster? - Is there any option to make it faster?
  • How to clone the disk reliably? - Is there any option that lets me check the status frequently?
  • It is an I/O problem - There are very likely to be input- and output-related features.
  • It is an operation on a block device - I have to think from the block device point of view.

These can then be narrowed down further:

  • Speed: what are the potential features that make read/write I/O on block devices faster? - Read/write chunk size. Error handling.
  • Reliability: is there any progress reporter? Is there any error handler?
So I might figure out that the "bs" option may play a role in speed, and that there are "sync" and "noerror" (both values of the "conv" option) for error handling. I could also suppose there should be an option for progress. It turns out to be status=progress.
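The options found above can be put together as follows. This is only a minimal sketch: the function name build_dd_command and its parameters are hypothetical, /dev/sdX and /dev/sdY are placeholders, and status=progress assumes a reasonably recent GNU dd. The function only builds the command; it does not run it.

```python
import shlex


def build_dd_command(source, target, block_size="4M",
                     show_progress=True, skip_read_errors=False):
    """Build (but do not run) a dd command line for cloning a disk."""
    cmd = ["dd", f"if={source}", f"of={target}", f"bs={block_size}"]
    if skip_read_errors:
        # noerror: keep going after a read error.
        # sync: pad short input blocks so later data keeps its offset.
        cmd.append("conv=noerror,sync")
    if show_progress:
        # Ask dd to report transfer statistics while it is copying.
        cmd.append("status=progress")
    return cmd


# Placeholder device names; double-check them before running anything.
print(shlex.join(build_dd_command("/dev/sdX", "/dev/sdY")))
```

Keeping the command as a list makes it easy to review before handing it to something like subprocess.run, which matters for a tool that overwrites whole disks.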



This mindset is similar to the one used when trying to find a solution to a problem, answer a question, or debug code. Essentially, all of them come down to figuring out the goal, collecting the associated information, applying it, and consciously reviewing the result.

For example, I have been working on SGE (Sun Grid Engine) infrastructure recently. I was prototyping a solution to build the infrastructure automatically with LXD/LXC. When I completed the prototype, I moved the solution (Juju/Charms) to MaaS. Then I was blocked. However, I soon found the root cause by thinking proactively, like:

  • The error looks like a network issue and a permission issue.
  • Building an SGE scheduler is a question of communication between nodes.
  • When I set up the prototype successfully, which parts were related to communication, networking, or permissions?
  • Is the same step applied in the new infrastructure flow?
Then I could understand which commands (qconf, for example) and which of their options I should look into. : )


In conclusion, when you already have some background knowledge, try to think "the tool that I already know would be great if it had this feature. Does it have this feature?" rather than googling the problem directly. Googling a problem will most of the time get you just an entry-level answer, nothing, or noise.


Wednesday, November 21, 2018

Modify casper/initrd of Ubuntu 18.10 Cosmic Cuttlefish


This article was first posted at https://askubuntu.com/questions/1094854/how-to-modify-initrd-initial-ramdisk-of-ubuntu-18-10-cosmic-cuttlefish/1094855 because this change is pretty new and it seems that nobody had asked about it on the internet. Posting it there should help many people in the months following the 18.10 release.


Besides, the following quote from the Debian wiki is useful as background knowledge.

  • If an uncompressed cpio archive exists at the start of the initramfs, extract and load the microcode from it to CPU.
  • If an uncompressed cpio archive exists at the start of the initramfs, skip that and set the rest of file as the basic initramfs. Otherwise, treat the whole initramfs as the basic initramfs.
  • unpack the basic initramfs by treating it as compressed (currently gzipped) cpio archive into a RAM-based disk.
  • mount and use the RAM-based disk as the initial root filesystem.
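To see which of the cases above applies to a given initrd, one can look at its first bytes. The following is a minimal sketch, not a full parser: it assumes the uncompressed archive uses the common "newc" cpio format (ASCII magic 070701) and only recognizes gzip as a compressed format; the function names are hypothetical.

```python
CPIO_NEWC_MAGIC = b"070701"  # ASCII magic of a "newc" cpio header
GZIP_MAGIC = b"\x1f\x8b"     # first two bytes of a gzip stream


def classify_initramfs_start(head):
    """Guess what an initramfs image starts with from its first bytes."""
    if head.startswith(CPIO_NEWC_MAGIC):
        return "uncompressed cpio"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"


def classify_initramfs_file(path):
    """Read just enough of the file to classify its beginning."""
    with open(path, "rb") as image:
        return classify_initramfs_start(image.read(len(CPIO_NEWC_MAGIC)))
```

If the result is "uncompressed cpio", the basic initramfs starts only after that first archive, so extracting the file from offset 0 alone does not show the whole content.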

Thursday, October 11, 2018

Troubleshooting - curtin version is incorrect on a MaaS region server

A few weeks ago a weird MaaS issue happened to me. When I tried to commission or deploy a node with the ga-18.04 kernel, the deployment cycle always stopped at the grub entry that shows "Commissioning".

After fighting for a few days, stopping in the ephemeral environment while the customized image was being dd'ed to the hard disk, I noticed that the well-functioning MaaS region server updates the kernel in the ephemeral environment and the malfunctioning MaaS region server doesn't. Using the new kernel is very important for deploying my customized images, because I need the nls_iso8859-1.ko module to deal with my recovery partition. This code snippet shows how a recent curtin (18.1) updates the kernel:


ubuntu@breckenridge-dvt2-201802-26115:/curtin$ grep linux-image -r *
Binary file curtin/deps/__pycache__/__init__.cpython-36.pyc matches
curtin/deps/__init__.py: # linux-image package for this environment
curtin/deps/__init__.py: kernel_pkg = 'linux-image-%s' % os.uname()[2]

def check_kernel_modules(modules=None):
    if modules is None:
        modules = REQUIRED_KERNEL_MODULES

    # if we're missing any modules, install the full
    # linux-image package for this environment
    for kmod in modules:
        try:
            subp(['modinfo', '--filename', kmod], capture=True)
        except ProcessExecutionError:
            kernel_pkg = 'linux-image-%s' % os.uname()[2]
            return [MissingDeps('missing kernel module %s' % kmod, kernel_pkg)]

    return []
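The same kind of check can be run by hand inside the ephemeral environment to see which modules are missing there. This is a minimal standalone sketch, not part of curtin: the function names are hypothetical, and it assumes modinfo is installed. Unlike the curtin code above, it reports every missing module instead of stopping at the first one.

```python
import subprocess


def module_available(name, run=subprocess.run):
    """Return True if modinfo can locate the kernel module `name`."""
    try:
        result = run(["modinfo", "--filename", name],
                     stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        # modinfo itself is not installed, so nothing can be confirmed.
        return False
    return result.returncode == 0


def missing_modules(names, run=subprocess.run):
    """Return the modules from `names` that modinfo cannot find."""
    return [name for name in names if not module_available(name, run)]
```

For the case in this post, calling missing_modules(["nls_iso8859-1"]) on the good and on the bad ephemeral environment would be one way to make the difference visible.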


Thus I dug into curtin, which takes care of the installation/dd of images, and noticed that the curtin version differs between the two MaaS region servers, even though both have the same version of MaaS installed. Updating curtin fixed the issue. The malfunctioning server used curtin 0.1.0, and the good server used 18.1.

In conclusion, the curtin version on the MaaS region server matters. It seems that curtin is mapped into the ephemeral environment and used there. Interesting!


Summary of the Debugging Tips




  • Summary of the debugging flow of this case
    • stopped at the grub entry
    • checked the previous stage and found errors in the curtin stage
    • compared how the good and the bad environment use curtin (ephemeral environment)
    • identified the root cause: the lack of nls_iso8859-1.ko
    • noticed that the good environment updates its kernel
    • figured out that the curtin source differs
    • found that the curtin version differs
  • The curtin log is valuable. Read it carefully and look for the very first error it reports.
  • Effective Debugging: 66 Specific Ways to Debug Software and Systems by Diomidis Spinellis suggests that comparing the buggy system with a well-functioning system may help. So true!






Monday, September 24, 2018

How does "source activate conda-virtual-environment" work?


When using conda from Anaconda or Miniconda to create and manage a Python virtual environment, this kind of command is commonly used to activate and deactivate the target virtual environment:
source activate <conda-virtual-environment-name>
How does this work? First we need to know:

  • source is a builtin of the bash shell. It is equivalent to . (a dot) in the dash shell.
    • bash manual page says
      • ... filenames in PATH are used to find the directory containing filename. ...
If you can use the conda command, your conda bin folder must be included in the PATH environment variable, so that the conda command can be found and used. If you go to that bin folder, you can see that the files activate and deactivate are in the same folder.


Thus, the command source activate conda-virtual-environment is actually

source <path-to-conda-bin-folder>/activate <conda-virtual-environment-name>

<conda-virtual-environment-name> is just the argument passed to the file activate. Reading the file activate will help you understand how the virtual environment is launched/activated.
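The lookup that source performs can be imitated to check which activate file would actually be picked up. This is a minimal sketch of the PATH search only, under the assumption that a name without a slash is looked up directory by directory; it is not bash's implementation, and the function name find_sourced_file is hypothetical. Note that a sourced file only has to exist; it does not need to be executable.

```python
import os


def find_sourced_file(name, path=None):
    """Return the file that `source name` would read, or None if not found."""
    if "/" in name:
        # A name containing a slash is used as given, without a PATH search.
        return name
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        # An empty PATH entry stands for the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Shows which "activate" the current PATH resolves to, if any.
print(find_sourced_file("activate"))
```

If the printed path is not inside your conda bin folder, another tool earlier in PATH provides its own activate file and would be sourced instead.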

By the way, recent conda versions are moving to conda activate to replace the conventional source activate.




Monday, May 14, 2018

Installer nightmare

Developing or debugging an installer can be very challenging because of limited resources. Besides, the long turnaround time is another big challenge. It can be a nightmare.

The limited resources here mean:

  • You have no idea where the log will be.
  • You may not be able to fetch the log you want.
  • You may not be able to access the log even if you know where it is.


The long turnaround time here means:


  • You can't reproduce the breakpoint within 5 minutes because you have to restart the machine and wait for image-level copying.


Below I take an example to illustrate the essence of the installer challenge. LAVA is a tool for Debian, and I will talk about Ubuntu.


Ubuntu Desktop installer


When developing the Ubuntu Desktop installer on a real machine in OEM mode, I often turn on debug mode by injecting debug parameters into the kernel parameter line and then look at /var/log/installer/debug. Hardcoding the frequently used parameters in the bootloader configuration (say grub.cfg for grub) may be a good idea, because typing the parameters by hand takes a lot of attention. The following parameters are the ones I use the most:


  • debug -- automatic-oem-config debug

I sometimes tweak casper/filesystem.squashfs as well to dump more specific messages at stage 1 of recovery. Besides, installing useful tools via chroot/dpkg may be a good idea as well. A better text editor and the ability to connect remotely over ssh may help me to interact with the installer and monitor the log at run time.

Tweaking the squashfs is useful, but it comes at a cost. A typical Ubuntu desktop squashfs could be 1 ~ 2 GB. If you are not using solid state disks, it takes a lot of time to extract the squashfs, modify it, and then re-pack it into a squashfs. The usual flow looks like:
  • sudo unsquashfs -d ./fs filesystem.squashfs (extract files)
  • sudo mksquashfs ./fs/ filesystem.squashfs.mod (pack modified files)
  • sudo cp filesystem.squashfs.mod ./<somewhere of your installing media>/casper/filesystem.squashfs (deploy)
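Because this round trip is repeated many times, it may be worth scripting. The following is a minimal sketch that only builds the three commands; the function name, the ".mod" suffix for the re-packed file, and the media path /media/usb are assumptions for illustration, so adjust them to your own layout.

```python
import shlex


def squashfs_roundtrip_commands(image="filesystem.squashfs",
                                workdir="./fs",
                                media_root="/media/usb"):
    """Return the extract, re-pack and deploy commands as argument lists."""
    modified = image + ".mod"
    return [
        # 1. Extract the original image into the working directory.
        ["sudo", "unsquashfs", "-d", workdir, image],
        # 2. Re-pack the (modified) working directory into a new image.
        #    Note: mksquashfs appends if the destination already exists.
        ["sudo", "mksquashfs", workdir, modified],
        # 3. Deploy the new image onto the installation media.
        ["sudo", "cp", modified, media_root + "/casper/filesystem.squashfs"],
    ]


for command in squashfs_roundtrip_commands():
    print(shlex.join(command))
```

The modification itself happens between steps 1 and 2; each list can be passed to subprocess.run once the paths have been checked.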

An FTP server for downloading debug tools can be a handy auxiliary.


Monday, April 23, 2018

Make an unattended installation of an Ubuntu server image

In another post I described how I investigated to find a preseed file that matches my requirements. This post summarizes the step-by-step commands.

First let's fetch the ISO contents from the original ISO image.

$ sudo mount -o loop ~/Desktop/images/ubuntu-16.04.1-server-amd64.iso ./160401-base-iso/
mount: /dev/loop6 is write-protected, mounting read-only
$ cp -rT ./160401-base-iso/ ./160401-target-iso/
$ sudo umount ~/Desktop/images/ubuntu-16.04.1-server-amd64.iso

Then let's slightly tweak isolinux, which boots the system at the very first stage:

$ chmod u+w 160401-target-iso/isolinux/txt.cfg
$ vi 160401-target-iso/isolinux/txt.cfg

You may want to use the txt.cfg directly from here https://gist.github.com/tai271828/cbe426c158c68ae8f51a18b0ad26af52#file-txt-cfg

The relevant entry looks like this:

default install
label install
  menu label ^Install Ubuntu Server
  kernel /install/vmlinuz
  append  file=/cdrom/preseed/unattended-ubuntu-server.seed vga=788 initrd=/install/initrd.gz quiet languagechooser/language-name=English debian-installer/locale=en_US keyboard-configuration/layoutcode=us ---

You may also want to use the isolinux.cfg from here https://gist.github.com/tai271828/cbe426c158c68ae8f51a18b0ad26af52#file-isolinux-cfg


$ chmod u+w 160401-target-iso/isolinux/isolinux.cfg
$ vi 160401-target-iso/isolinux/isolinux.cfg

Set timeout to 1 (or any positive integer; 0 means waiting forever).
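If you rebuild the ISO often, the timeout edit can be scripted instead of done in vi. This is a minimal sketch with a hypothetical function name: it works on the text of isolinux.cfg, replaces an existing timeout line, and appends one if none exists. ISOLINUX counts the timeout in tenths of a second, so 1 is almost immediate.

```python
import re

# Matches a whole "timeout <number>" line, but not e.g. "totaltimeout".
_TIMEOUT_LINE = re.compile(r"^[ \t]*timeout[ \t]+\d+[ \t]*$",
                           re.IGNORECASE | re.MULTILINE)


def set_isolinux_timeout(cfg_text, value=1):
    """Return cfg_text with its timeout directive set to `value`."""
    if value <= 0:
        raise ValueError("use a positive value; 0 would wait forever")
    line = "timeout %d" % value
    if _TIMEOUT_LINE.search(cfg_text):
        return _TIMEOUT_LINE.sub(line, cfg_text)
    # No timeout directive yet: append one on its own line.
    separator = "" if (not cfg_text or cfg_text.endswith("\n")) else "\n"
    return cfg_text + separator + line + "\n"
```

A typical use would be to read 160401-target-iso/isolinux/isolinux.cfg, pass its content through set_isolinux_timeout, and write the result back.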


Lastly, let's put the preseed file unattended-ubuntu-server.seed in place; txt.cfg expects it at /cdrom/preseed/unattended-ubuntu-server.seed. Please check the details of the preseed file here: https://gist.github.com/tai271828/cbe426c158c68ae8f51a18b0ad26af52#file-unattended-ubuntu-server-seed

Everything is ready! Let's generate the ISO. Under the target ISO folder, run:

sudo mkisofs -r -V "UBUNTU160401" -cache-inodes -J -l -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -o ~/Desktop/images/ubuntu-16.04.1-server-amd64-autoinstall.iso .

The above steps are wrapped up here https://gist.github.com/tai271828/cbe426c158c68ae8f51a18b0ad26af52#file-prepare-image-sh Tweak it to match your file hierarchy.

PS Ubuntu server by default has no tty7 (conventionally used by X to provide the GUI) and uses tty1.

What is Next...

The final product is an ISO. You can easily boot it in virt-manager by using it as a virtual CD-ROM disk. However, nowadays we use live USB sticks much more often. To make a bootable live USB from this ISO, you can run:


sudo isohybrid ubuntu-16.04.1-server-amd64-autoinstall.iso
sudo dd bs=4M if=./ubuntu-16.04.1-server-amd64-autoinstall.iso of=/dev/sdX conv=fdatasync


For how and why isohybrid works, you can refer to another blog post of mine, in Chinese: http://zh-tw-tai271828.blogspot.tw/2017/07/hack-iso-usb-isohybrid.html


Sunday, April 22, 2018

The investigation process of making an unattended installation of an Ubuntu server image

If you are looking for a step-by-step solution, this post is NOT for you. Please go here for the step-by-step solution: http://tai271828.blogspot.tw/2018/04/make-unattended-installation-of-ubuntu.html .


The long-lived and respectable installer, debian-installer (d-i), provides a powerful feature to customize and automate your installation: preseeding. A preseed file is a configuration file that answers the questions of the installation prompts. The core question is: how do I know which question belongs to which prompt, and which answers are acceptable to or understood by d-i?


If you check the d-i manual to try to answer the above question, you will find that (1) the document lists basic question-answers grouped by feature, (2) the groups give a basic description and may not elaborate on the details of each question-answer item, and (3) you want to customize your installation but can't find the question-answer that matches your requirement.

To overcome the lack of information, I did the following:

(1) (default, RTFM ;) ) check the manual.[1]
(2) search for example preseed files developed by others, by googling or by checking others' open source projects.
(3) find a workable benchmark. (debconf-get-selections is a useful and promising tool for this)


Regarding (1), I very often read the manual in this way: (a) confirm my goal ("What question do I want to solve? Describe it in technical, actionable terms."); (b) imagine what the solution may look like and what its design may be; (c) look for that possible design in the manual; (d) if nothing is found, skim the subtitles of the manual and go back to (a), (b), and (c). Reading the document line by line is the last resort, which I may or may not adopt.

Regarding (2), google may be useful (google "unattended ubuntu server preseed github", for example) or NOT. In my experience, targeting open source projects and looking at their source is more efficient, especially when the project is still alive and maintained.

Regarding (3), this skill is also suggested by Effective Debugging: 66 Specific Ways to Debug Software and Systems: Item 5, Find the Difference between a Known Good System and a Failing One [2]. To create the benchmark of a good system (with working answers to the questions), use debconf-get-selections, a tool that dumps the contents of the debconf database [3], which contains the question-answers of your system. You may want to append --installer when using debconf-get-selections to fetch the question-answers of the installation stage.

The usual flow looks like this:


  1. Prepare a working system that was installed manually, with all prompts answered as you wish. A VM may be a good choice.
  2. Log in to the working system. Dump the question-answers with debconf-get-selections --installer
  3. Compare the question-answers, or look for the question-answers that correspond to the prompt you want to bypass automatically.
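The comparison in step 3 can be partly automated. The following is a minimal sketch with hypothetical function names. It assumes each non-comment line of the dump has the form "owner question type value" separated by whitespace, and it compares the dump of the working system with a candidate preseed file written in the same format.

```python
def parse_selections(dump):
    """Map (owner, question) to (type, value) for each answer line."""
    answers = {}
    for raw_line in dump.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and description comments
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # not a complete answer line
        owner, question, question_type = fields[:3]
        value = fields[3] if len(fields) == 4 else ""
        answers[(owner, question)] = (question_type, value)
    return answers


def diff_selections(good_dump, candidate_dump):
    """Return answers that are missing or different between the two dumps."""
    good = parse_selections(good_dump)
    candidate = parse_selections(candidate_dump)
    return {
        key: (good.get(key), candidate.get(key))
        for key in sorted(set(good) | set(candidate))
        if good.get(key) != candidate.get(key)
    }
```

Each entry of the result pairs the answer from the working system with the one from the candidate, with None marking a question that one side does not answer at all. Those entries are the first places to look when a prompt still shows up.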

This flow involves some trial and error. Setting up an easily repeatable flow will help a lot.





[1] Usually Appendix B. is recommended https://www.debian.org/releases/stable/amd64/apb.html.en

[2] https://books.google.com.tw/books?id=Fa6JDAAAQBAJ&pg=PT125&lpg=PT125&dq=effective+debugging+titles&source=bl&ots=moKocciySF&sig=HuoF5Dl3mJBoGf84YUXix-Hk5ic&hl=en&sa=X&ved=2ahUKEwjFwajc3c3aAhXNNpQKHa6dDocQ6AEwA3oECAAQSQ#v=onepage&q=effective%20debugging%20titles&f=false

[3] debconf-get-selections is not available by default; you may need to install it with "apt-get install debconf-utils".