Fedora Core 11 upgrade notes

See some specific notes on firefox HERE.

Backup vital things

It has been less than a month since I upgraded Fedora Core 9 to 10, and now I am jumping to 11. Not to worry. I am lazy, so I am skipping the whole business of cleaning out orphan packages and the like.

As I did for Core 10, I make a cover-my-butt backup of vital files:

mkdir -p /root/upgrade/f10-f11
cd /root/upgrade/f10-f11
# gather info for potential recovery later
tar -C / -czf etc.tgz etc
rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort > rpm.ls.f10
chkconfig --list > chkconfig.ls.f10
ifconfig > ifconfig
route -n > route-n
df -h > df-h
cp -p /boot/grub/grub.conf grub.conf.f10
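
A saved package list like the one above is most useful once the upgrade is over, when it can be compared against a fresh list to see which packages vanished along the way. What follows is only a rough sketch: the function names pkg_names and pkgs_missing are made up for this example, and it assumes both lists were written with the same rpm -qa query format shown above.

```shell
# pkg_names FILE
# Reduce NAME-VERSION-RELEASE.ARCH lines to bare package names.
# Version and release never contain a dash, so dropping the last two
# dash-separated fields leaves just the name.
pkg_names() {
    sed 's/-[^-]*-[^-]*$//' "$1" | LC_ALL=C sort -u
}

# pkgs_missing BEFORE AFTER
# Print the names that appear in BEFORE but not in AFTER.
pkgs_missing() {
    before_names=$(mktemp) || return 1
    after_names=$(mktemp) || { rm -f "$before_names"; return 1; }
    pkg_names "$1" > "$before_names"
    pkg_names "$2" > "$after_names"
    # comm -23 keeps only the lines unique to the first file
    LC_ALL=C comm -23 "$before_names" "$after_names"
    rm -f "$before_names" "$after_names"
}
```

Something like pkgs_missing old.list new.list (both file names are placeholders) would then print one package name per line. Comparing names rather than full version strings matters here, since nearly every version changes in an upgrade.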

Switch repositories and do the upgrade

I move all the .repo files (for FC10) into /etc/yum.repos.d/FC10 - then I pull in the rpm with the new repos
yum clean all
rpm -Uhv ftp://download.fedora.redhat.com/pub/fedora/linux/releases/11/Fedora/x86_64/os/Packages/fedora-release-*.noarch.rpm
yum update rpm\* yum\*
I hand edit fedora.repo and fedora-updates.repo to use our local mirror. Then I try this:
yum update rpm\* yum\*
And it does not want to pull in anything, so I do:
yum clean all
yum update rpm\* yum\*
It now wants to pull in hundreds of packages, and it trips over missing dependencies:

Error: Missing Dependency: libssl.so.7()(64bit) is needed by package gpac-libs-0.4.5-0.5.20080217cvs.fc9.x86_64 (installed)
Error: Missing Dependency: python(abi) = 2.5 is needed by package livna-config-display-0.0.23-1.fc9.noarch (installed)
Error: Missing Dependency: libcrypto.so.7()(64bit) is needed by package gpac-libs-0.4.5-0.5.20080217cvs.fc9.x86_64 (installed)

Since these are all orphaned fc9 packages, it seems my lazy avoidance of scanning for and getting rid of orphaned packages just did not pan out. Some things to try are:
package-cleanup --dupes
package-cleanup --problems
rpm -Va --nofiles --nodigest
package-cleanup --leaves
package-cleanup --orphans
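
Since the troublemakers all carry an old dist tag in their names, one cheap way to get a feel for the size of the problem is to filter a package list by that tag. This is just a sketch; the function name pkgs_with_disttag is hypothetical, and it assumes names in the NAME-VERSION-RELEASE.ARCH form seen in the error messages.

```shell
# pkgs_with_disttag TAG
# Read package names (one per line) on stdin and print only those
# whose release carries the given dist tag, for example fc9.
pkgs_with_disttag() {
    # In names like foo-1.0-2.fc9.x86_64 the tag sits between two
    # dots, so match ".TAG." as a fixed string rather than a regex.
    grep -F ".$1."
}
```

For example, rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | pkgs_with_disttag fc9 would list whatever fc9 leftovers are still installed.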
What I actually do try is the following. There are boatloads of old fc9 packages on my machine, and I am loath to get into a wholesale purge just yet:
package-cleanup --orphans | grep gpac
yum erase gpac-libs-0.4.5-0.5.20080217cvs.fc9.x86_64
package-cleanup --orphans | grep livna
yum erase livna-config-display-0.0.23-1.fc9.noarch
yum update rpm\* yum\*
This finds a plethora of interdependent packages to upgrade (over 400 as compared to maybe 16 when I did my FC10 upgrade), but we just dive in and let it do what it thinks is best. It gets stuck with:
Transaction Check Error:
  file /usr/share/man/man5/dhcp-eval.5.gz from install of dhcp-12:4.1.0-20.fc11.x86_64 conflicts with file from package dhclient-12:4.0.0-35.fc10.x86_64
  file /usr/share/man/man5/dhcp-options.5.gz from install of dhcp-12:4.1.0-20.fc11.x86_64 conflicts with file from package dhclient-12:4.0.0-35.fc10.x86_64
The solution to this is:
yum erase dhclient
yum update rpm\* yum\*
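
When the list of file conflicts is longer than two lines, it can help to boil it down to just the installed packages that are in the way. A minimal sketch, assuming the yum output has been saved to a file and that the wording of the message matches what is shown above (the function name conflicting_pkgs is invented for this example):

```shell
# conflicting_pkgs
# Read yum "Transaction Check Error" output on stdin and print each
# already-installed package named after "conflicts with file from
# package", once.
conflicting_pkgs() {
    sed -n 's/.* conflicts with file from package \(.*\)$/\1/p' | sort -u
}
```

Feeding it the two lines above would print dhclient-12:4.0.0-35.fc10.x86_64 a single time.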

While this is chugging away, my network connectivity abruptly vanishes. My resolv.conf file has been replaced (yet again) by some bogus and useless thing that I am advised not to hand edit, and when I run ifconfig -a, I see that my interface has no assigned IP address. So I dutifully replace resolv.conf with the working file I keep on hand for such occasions, then do service network restart, and this gets me back up and running. Meanwhile the yum update seems to notice nothing amiss. I wish I knew what fedora had in mind with resolv.conf, but I don't know how to even begin sorting this out.
See the section below under the heading "That accursed network manager".
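
Putting the known-good file back can be reduced to a small helper. This is a sketch under assumptions: the function name restore_if_changed is made up, and where the good copy lives is up to you.

```shell
# restore_if_changed GOOD LIVE
# Copy GOOD over LIVE when the two differ (or LIVE is missing).
# Prints "restored" or "unchanged" so the result can be logged.
restore_if_changed() {
    if cmp -s "$1" "$2"; then
        echo unchanged
    else
        # -p keeps the mode and timestamps of the good copy
        cp -p "$1" "$2" && echo restored
    fi
}
```

With a good copy kept in, say, /root/resolv.conf.good (a hypothetical path), restore_if_changed /root/resolv.conf.good /etc/resolv.conf followed by service network restart covers the recovery described above.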

Well, it is about 24 hours after starting the above. Doing the partial update of rpm and yum (and letting it run while I went home for the night) made a gigantic mess that I am only now digging out of. The fact that it wanted to update hundreds of packages instead of just 16 should have told me something. Now that my system is back up, I have forgotten all the things I fought my way through. All I can say is that FC11 seems to be one of those releases that tears the hell out of everything. At this stage, I am running FC11, but with the following unsolved problems (and probably others I just have not found yet):

gdm not starting

This was actually pretty simple. The default run level in /etc/inittab was set to 3 instead of 5, no doubt due to my X server being unable to start after I switched to FC11. (This required getting the latest nvidia driver from the Nvidia web site). The quick fix was a simple telinit 5, but the real fix is to edit the file /etc/inittab so the last line looks like this:
id:5:initdefault:
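
A quick way to check what the default run level is set to, without opening an editor, might look like this. It is only a sketch; the function name default_runlevel is made up, and it assumes the classic colon-separated inittab layout shown above.

```shell
# default_runlevel [FILE]
# Print the run level from the initdefault entry of an inittab file.
# FILE defaults to /etc/inittab.
default_runlevel() {
    # inittab fields are id:runlevels:action:process; skip comment lines
    awk -F: '!/^[ \t]*#/ && $3 == "initdefault" { print $2 }' "${1:-/etc/inittab}"
}
```

If this prints 3 when you expected 5, that explains why gdm never shows up.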

Nvidia driver

It would be a good idea to always grab the latest driver from the nvidia site before doing an upgrade, since often the very latest driver is required to work with really up to date kernels. I discovered a nice thing along the way though. The old NVIDIA script can be run with the --update option and it will fetch a newer driver if one exists (which is just what I did). This is really handy since it can be done without a working web browser (and you never have a working web browser when you are in a tangle with an older driver). After I got up and running, I did go to the NVIDIA site and downloaded the latest driver so I would have it handy.

That accursed network manager

I hate network manager, and you should too!

As near as I can tell, network manager (popularly known as network mangler), ALWAYS does the WRONG THING. In particular, a wrong thing it does is to trash my resolv.conf file on every reboot.

See my notes on how to nuke network manager from orbit.

Firefox

See my firefox notes.

Rails

I almost expect rails to break every time I do a fedora upgrade. I really cannot blame fedora in any way for this though; it all boils down to jumping to a major new version of rails when I went to FC11.

The best clue is in /u1/rails/micros/log.
When I look at the file mongrel.8000.log I see:

** Starting Mongrel listening at 127.0.0.1:8000
** Starting Rails with development environment...
/usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:580:in `send': undefined method `cache_template_extensions=' for ActionView::Base:Class (NoMethodError)
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:580:in `initialize_framework_settings'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:579:in `each'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:579:in `initialize_framework_settings'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:576:in `each'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:576:in `initialize_framework_settings'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:155:in `process'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:113:in `send'
	from /usr/lib/ruby/gems/1.8/gems/rails-2.3.2/lib/initializer.rb:113:in `run'
	 ... 12 levels...
	from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/bin/../lib/mongrel/command.rb:212:in `run'
	from /usr/lib/ruby/gems/1.8/gems/mongrel-1.1.5/bin/mongrel_rails:281
	from /usr/bin/mongrel_rails:19:in `load'
	from /usr/bin/mongrel_rails:19
Using Google to search on the above error, I find that two methods, cache_template_extensions and cache_template_loading, are now deprecated (actually they seem gone altogether if you ask me). Apparently there have been deprecation warnings in my logs (that I never look at) for some time. The warnings have been telling me that the methods are deprecated, have no effect, and should be removed from my config files!

I find (and comment out) the offending lines.
In /u1/rails/micros/config/environments/development.rb I find the line:

config.action_view.cache_template_extensions         = false

In /u1/rails/ancient_app/config/environments/production.rb I find the line:
config.action_view.cache_template_loading         = true
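
With more than a couple of applications, hunting for these lines by hand gets old fast. A sketch of a search that might do the job, assuming the settings are spelled as in the two lines above (the function name find_template_cache_settings is invented for this example):

```shell
# find_template_cache_settings DIR
# List file:line:text for every live (not commented out) use of the
# two template cache settings under DIR.
find_template_cache_settings() {
    # -r recurse, -n show line numbers, -E extended regex;
    # the second grep drops lines whose text starts with "#"
    grep -rnE 'cache_template_(extensions|loading)' "$1" |
        grep -vE '^[^:]*:[0-9]+:[[:space:]]*#'
}
```

Running it against /u1/rails/micros/config before and after commenting out the lines is a cheap way to confirm nothing was missed.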

Well now I can start the mongrel cluster, but I am getting into new troubles (it would seem that FC11 has moved on to Rails 2.3.2, which is a good thing, but causing pain for now). The next problem is that the application gives the message:

uninitialized constant ApplicationController
Apparently the file application.rb has been renamed to application_controller.rb. The ugly hackish fix would be to rename this file or make a link to it. The recommended right way to move ahead is:
cd /u1/rails/micros
rake rails:update
This tells me:
/u1/rails/micros/app/controllers/application.rb has been renamed to /u1/rails/micros/app/controllers/application_controller.rb
I am exhorted to "update my SCM as necessary". Now my application is working again! I still need to figure out what my SCM is, find and update it. (A little research tells me that SCM stands for software configuration management, something like git or maybe svn).

Boot Messages

As of FC10, fedora now defaults to a graphical boot that does not show boot messages. I would rather see the boot messages (a spot of red is often the first warning I get of trouble), and I can't explain the thinking of those who make it hard to see them. They say that as long as the option "rhgb" is not on the boot line, you will see all the boot messages. Or you can hit the Escape key during startup. Or you can look at the file /var/log/boot.log after boot up. (This last trick is handy in the case where you just weren't watching carefully or something scrolled off the screen before you started paying close attention.)

Apparently there is a bug, and having norhgb on the boot line turns ON the graphical boot just the same as having rhgb on the boot line does. Now that I have edited grub.conf and removed norhgb, I get to see the boot messages and I am happy.
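
To see at a glance which kernel lines still carry the option, something like the following might help. It is a sketch only: the function name graphical_boot_lines is made up, and it assumes a legacy grub.conf where each boot entry has a line beginning with the word kernel.

```shell
# graphical_boot_lines [FILE]
# Print, with line numbers, the kernel lines in a grub.conf that
# mention rhgb. FILE defaults to /boot/grub/grub.conf.
graphical_boot_lines() {
    # The pattern also catches "norhgb", since it contains "rhgb"
    grep -nE '^[[:space:]]*kernel[[:space:]].*rhgb' "${1:-/boot/grub/grub.conf}"
}
```

No output means no kernel line mentions either spelling of the option.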

svn

Where did svn go?
svn: error while loading shared libraries: libsvn_client-1.so.0: cannot open shared object file: No such file or directory
rpm -qf /usr/bin/svn
file /usr/bin/svn is not owned by any package
I do yum install subversion and everything seems to be fixed!

dhcpd

This would not start. (The file /var/log/boot.log is handy for checking on which services do and do not start on boot up, given that this information either scrolls off the screen before details can be noted and/or is cleverly hidden by some glitzy graphical boot insanity.)

The fix here is to learn that the dhcpd.conf file now lives in /etc/dhcp rather than being in /etc as it has for so long. On my system, a useless file named /etc/dhcp/dhcpd.conf was deposited during the upgrade, and my previous file was renamed to /etc/dhcpd.conf.rpmsave. This latter tricked me, as when I renamed it back to /etc/dhcpd.conf, I did not realize for quite some time that the useless file in /etc/dhcp/dhcpd.conf was the one being referenced. To fix this:

mv /etc/dhcpd.conf.rpmsave /etc/dhcp/dhcpd.conf
service dhcpd restart
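
The .rpmsave suffix is the tip-off that a config file was set aside during the upgrade, so it is worth looking for others of its kind. A minimal sketch (the function name leftover_rpm_configs is hypothetical; .rpmnew files, which rpm leaves when it keeps your file and parks the new one beside it, are included as well):

```shell
# leftover_rpm_configs [DIR]
# List config files that rpm set aside (.rpmsave) or declined to
# install over an edited file (.rpmnew). DIR defaults to /etc.
leftover_rpm_configs() {
    find "${1:-/etc}" -type f \( -name '*.rpmsave' -o -name '*.rpmnew' \) | sort
}
```

Each file it prints is a candidate for the same treatment as dhcpd.conf: compare it with whatever is now in its place and decide which one should win.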

MySQL

On one of my systems, there was no problem with mysql at all when I upgraded from FC10 to FC11.

On another, the mysqld service refuses to start up. With FC11, MySQL jumps from 5.0.77 to 5.1.35, and the change from 5.0 to 5.1 is a significant one. Apparently with 5.1 there is a "plugin API" which requires a table called "mysql.plugin", which must be created by running mysql_upgrade. However this is impossible to do if the server is not running. This leads to (in some cases) an ugly Catch-22.

The errors I get look like:

090629 11:17:10 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
/usr/libexec/mysqld: Table 'mysql.plugin' doesn't exist
090629 11:17:10 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
090629 11:17:10  InnoDB: Started; log sequence number 0 6800554
090629 11:17:10 [ERROR] Fatal error: Can't open and lock privilege tables: Incorrect key file for table 'host'; try to repair it
090629 11:17:10 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
It is impossible to run mysqldump or scripts like mysql_fix_privilege_tables or mysql_upgrade since there is no server to connect to.

There may be more than one problem here. Some suggestions from a google search are:

The second suggestion was a useful lead, although the command to give is actually:
mysqld_safe --skip-grant-tables
This brings up mysqld without any authentication restrictions, so anybody can connect to the server (which is not good at all). What you are advised to immediately do is:
mysql
FLUSH PRIVILEGES;
In the normal case this would cause it to reread the grant tables, restoring all of the usual authentication requirements (but leaving your mysql session intact, a nice trick, and quite useful if you, say, need to clear a lost root password). However in my case, I just get the familiar message:
mysql
FLUSH PRIVILEGES;
Incorrect key file for table 'host'; try to repair it
use mysql
select * from host;
Incorrect key file for table 'host'; try to repair it
I find this is true for every table in the entire mysql database (which seems to be the new thing for mysql 5.1 that is all buggered up).

Dive in and try to fix the mysql mess

Starting mysqld by skipping the grant tables is useful, as it allows me to run mysqldump and backup my application databases:

mysqld_safe --skip-grant-tables
mysqldump --opt first_database >first_database_sql.dump
mysqldump --opt second_database >second_database_sql.dump
After this, I kill the mysqld process and do this:
cd /var/lib/mysql/mysql
myisamchk -r *.MYI
service mysqld restart
This does indeed do a bunch of repairing. Unfortunately, mysqld still does not start. Now the messages in /var/log/mysql.log look like:
080717 18:52:34  mysqld started
080717 18:52:34  InnoDB: Started; log sequence number 0 4686097
080717 18:52:35 [Warning] Can't open and lock time zone table: Table 'mysql.time_zone_leap_second' doesn't exist trying to live without them
080717 18:52:35 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.51a'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
080913 22:21:22 [Note] /usr/libexec/mysqld: Normal shutdown

After this I try the following commands in random order, but nothing cleans up the mess:

mysql_upgrade
mysql_fix_privilege_tables
myisamchk -r /var/lib/mysql/mysql/*.MYI
Actually, one should not run myisamchk while the server is also running, since even more terrible data corruption can occur if they both are accessing the same data.
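
One way to keep from making that mistake is to wrap the repair in a check for a live server. The following is a sketch under assumptions: the function names are made up, the pid file is taken to be the /var/run/mysqld/mysqld.pid seen in the log above, and the check relies on kill -0, which needs permission to signal the process (so run it as root).

```shell
# server_running PIDFILE
# Succeed if PIDFILE names a process that is still alive.
server_running() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests that the process exists
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# repair_if_stopped PIDFILE DIR
# Run myisamchk on the MyISAM index files in DIR, but only when no
# server is alive according to PIDFILE.
repair_if_stopped() {
    if server_running "$1"; then
        echo "mysqld is still running; not touching $2" >&2
        return 1
    fi
    myisamchk -r "$2"/*.MYI
}
```

For example: repair_if_stopped /var/run/mysqld/mysqld.pid /var/lib/mysql/mysql. A server started by hand with a different pid file would slip past this check, so it is a seat belt, not a guarantee.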

Get out the big hammer and fix the mysql mess

In desperation, I do this:
cd /var/lib/mysql
mv mysql mysql_BAD_DOG
service mysqld restart
To my amazement, mysqld starts now. And it creates a new mysql directory to replace the one I just moved off to the side. It of course lacks our small collection of grant tables, but that won't be too hard to recreate:
grant CREATE,INSERT,DELETE,UPDATE,SELECT on stuff.* to 'ourstaff'@'localhost' identified by 'axlegrease';
Now it seems we are (or may be) back on the air.

iptables

I really cannot blame fedora for this one. Some of our machines use firewall scripts generated by firestarter. The script ends up in /etc/firestarter/firewall.sh. We make a symbolic link from /etc/sysconfig/iptables to this script (this is not a great idea). The iptables startup script then tries to feed this to iptables-restore via:
iptables-restore /etc/sysconfig/iptables
This fails of course, but it fails with the iptables ruleset being cleared out, which is bad. The answer here is:
rm /etc/sysconfig/iptables
chkconfig iptables off
We invoke the firestarter script in rc.local.
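
The root of the trouble is that iptables-restore wants a ruleset in the format written by iptables-save, not a shell script. A crude check for telling the two apart might look like this (a sketch; the function name looks_like_iptables_save is invented, and the test is only a heuristic based on the table and COMMIT lines that a saved ruleset contains):

```shell
# looks_like_iptables_save FILE
# Succeed if FILE resembles iptables-save output: at least one table
# line such as "*filter" and at least one COMMIT line.
looks_like_iptables_save() {
    grep -q '^\*' "$1" && grep -q '^COMMIT' "$1"
}
```

Running looks_like_iptables_save /etc/sysconfig/iptables || echo "not a saved ruleset" on a machine with the symbolic link in place would have flagged the problem before the ruleset got cleared.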

ESC, escd, and smart cards

I don't really know what is going on here, but it is annoying. First, I see some escd process has been running (with a uid of me as a normal user) and has been consuming 100 percent of the CPU for 18,000 minutes or so, so I kill it and that seems to be that. Also just after I login, I either get some inscrutable warning about something called ESC still running. It looks like this:
ESC is already running, but is not responding, To open a new window, you must first close the existing ESC process, or restart your system.
Either that, or I get some silly dialog about smart cards, whatever the heck they are. (Apparently smart cards are some kind of authentication scheme that I don't envision using in the foreseeable future, if ever.)

There must be some way to make all of this go away.

It would seem that ESC is the "Enterprise Security Client Smart Card Client" (it isn't clear if "SC" stands for security client or smart card).
I verify all this and ditch the package via the following:

yum list | grep esc
rpm -qi esc
yum erase esc

swap partition

For some reason, when doing one of my installs, the swap partition got labelled with type 83 instead of 82 and was not being recognized.

The command swapon -s tells what swap partitions are in use.

The following sets up the partition /dev/sdx3 to be used as swap:

fdisk /dev/sdx
t
3
82
w
q

mkswap /dev/sdx3
swapon /dev/sdx3
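
To confirm from a script that the partition really is in use afterwards, the same information swapon -s shows can be read from /proc/swaps. A small sketch (the function name active_swap_devices is made up for this example):

```shell
# active_swap_devices [FILE]
# Print the device or file name of each active swap area.
# FILE defaults to /proc/swaps, which is what swapon -s reports.
active_swap_devices() {
    # The first line is a header; the name is the first column
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}
```

Piping the result through grep -x /dev/sdx3 gives a yes or no answer for one particular partition.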

Get rid of the Lion, part I

It has only been two days, and I am already tired of this lion face on my desktop. This should be something I can fix via my desktop preferences. Indeed, this is easy, I just go to (I am running Gnome) System->Preferences->Appearance and select a different desktop background. I can even use the ADD button to insert one of my images. (It slices off the bottom of the image on my short and wide dual head setup).

My image 2006_9_evolution/img_2343.jpg is a very nice abstract image of waves on a High Sierra lake. Nicest of all, the slicing of the bottom of the image preserves the aspect ratio of the image, rather than stretching it across a pair of dual head monitors (that are letterbox aspect already).
Nice job!

Get rid of the Lion, part II

The above worked great for my desktop background when I am logged in, but when I log out and gdm is offering to log users in, we are back to the Lion once again. There must be a way to change the image used by gdm. Indeed there is, and this can be fiddled via GUI style tools. In fact via exactly the same tool as I used in part I above.

Go to System->Preferences->Appearance and select a background, and after selecting it, click on the "Make default" button at the bottom of the GUI. This will cause the background to be set for both the desktop while logged in, and the gdm background when the gdm "greeter" is up. After doing this, you can select a different background for the "logged in" background only (just don't click the "Make default" button).

This works perfectly for what I want just now, and the Lion is gone entirely for my way of life. It is worth (sort of) pointing out that this opens up the possibility of "gdm background tug-of-wars" between two different users who are using the "Make default" button.

Someday I would like to run a script that sets a different background for each day of the week (one for Monday, ....). The following are some leads that might be followed up in an effort to figure out how to do that:

Themes are (or were) in /usr/share/backgrounds. The directory /usr/share/backgrounds/leonidas and the file /usr/share/backgrounds/leonidas/leonidas.xml are suggestive, but have not been modified by the GUI fiddling above, so there is a need to look deeper.

The following command has been suggested to fiddle with the gdm background:

su -c 'gconftool-2 --direct --config-source xml:readwrite:/var/lib/gdm/.gconf -s --type string /desktop/gnome/background/picture_filename /home/foo/background.jpg' gdm

or the following to set and then verify a background:

gconftool-2 --set --type String /desktop/gnome/background/picture_filename /usr/share/backgrounds/mysolar/mysolar.xml
gconftool-2 --get /desktop/gnome/background/picture_filename
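
As a starting point for the day-of-the-week idea, here is an untested sketch built on the gconftool-2 command above. Everything about the layout is an assumption: the function names are made up, and the pictures are expected to sit in one directory as monday.jpg, tuesday.jpg and so on. It only handles the logged-in desktop background, not gdm.

```shell
# background_for_day DIR [DAY]
# Print the path of the picture for DAY (default: today), assuming
# files named after the lowercase English day names.
background_for_day() {
    day=${2:-$(LC_ALL=C date +%A)}
    day=$(printf '%s' "$day" | tr '[:upper:]' '[:lower:]')
    printf '%s/%s.jpg\n' "$1" "$day"
}

# set_todays_background DIR
# Point the GNOME background key at today's picture, if it exists.
set_todays_background() {
    picture=$(background_for_day "$1")
    [ -r "$picture" ] || { echo "no picture: $picture" >&2; return 1; }
    gconftool-2 --set --type string /desktop/gnome/background/picture_filename "$picture"
}
```

Calling set_todays_background /home/foo/backgrounds from within a login session (the directory is a placeholder) would be the obvious first experiment.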

Have any comments? Questions? Drop me a line!

Adventures in Computing