MCS plugins going mad/MCS manager thread problem?

Benedikt Meurer benedikt.meurer at unix-ag.uni-siegen.de
Wed Mar 10 23:43:56 CET 2004


So, gentlemen, please take a seat. I'm going to report on one of the most
neckbreaking bugs I've ever seen:

It happens that the mcs manager does not always store settings for some of the
mcs plugins, most notably the xfwm4, margins, workspace and gtk. Instead it
creates a file with a size of 0, but it stores settings for e.g. session
without problems. I should note that this happens randomly, and that I'm not
able to reproduce it every day. But today I was lucky enough to reproduce it
(I've seen this on FreeBSD and NetBSD so far; I think I've also encountered
it on Solaris, but I'm not sure anymore). The system in question is currently
FreeBSD/i386 5.2.1-RELEASE.

So I rewrote the mcs-manager saving logic and added all kinds of error and
debug reporting stuff. The actual implementation now looks like this (it uses
atomic saving instead of locking now):

mcs_manager_save_channel_to_file (...)
{
  g_snprintf (tmp_path, PATH_MAX, "%s.tmp", filename);

  fp = fopen (tmp_path, "w");
  if (fp == NULL) { ... }

  /* fprintf the contents */

  g_message ("file %s: fp->_flags = %d, fp->_file = %d", filename,
             (int)fp->_flags, (int)fp->_file);

  if (fclose (fp) == EOF) { bail! }
  if (rename (tmp_path, filename) < 0) { ...}
}

Now, if I launch the xfce-mcs-manager and go directly to the xfwm4 settings
and change something there, it tries to use fileno 4 and fclose() bails out
with EBADF, and the resulting file is 0 bytes. But once I do some stuff in the
session plugin and return to the xfwm4 plugin, it uses fileno 11 and succeeds.

Even more confusing, if I modify the above and replace the fopen() line with

  fp = fopen (tmp_path, "w");
  FILE *fp1 = fopen (tmp_path, "w");
  if (fp != NULL)
   fclose (fp);
  fp = fp1;

the xfwm4 succeeds on the first run as well.
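
One thing this change certainly does is move the stream to a different
descriptor number: while the first stream is still open, the second fopen()
cannot be handed the same fd. Whether that is the reason the save succeeds is
only a guess. A minimal sketch to observe the shift (the helper name
reopen_fd_pair is made up for this example and is not part of libxfce4mcs):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: opens the same path twice, as in the workaround,
 * and reports the descriptor behind each stream.
 * Returns 0 on success, -1 if either fopen() fails. */
static int
reopen_fd_pair (const char *path, int *first_fd, int *second_fd)
{
  FILE *fp;
  FILE *fp1;

  fp = fopen (path, "w");
  if (fp == NULL)
    return -1;

  /* fp is still open here, so fp1 must use another descriptor. */
  fp1 = fopen (path, "w");
  if (fp1 == NULL)
    {
      fclose (fp);
      return -1;
    }

  *first_fd = fileno (fp);
  *second_fd = fileno (fp1);

  fclose (fp);
  fclose (fp1);
  return 0;
}
```

Calling it with a scratch path and printing both numbers should show two
different descriptors, which matches the observation that the save only fails
on one particular fileno.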

Now, given the fact that the stdio stuff has been proven to work for ages and
taking into account that it works for e.g. the session plugin, I came to the
conclusion that some of our mcs plugins are going mad. I've been checking the
plugins and tracing stdio functions for over 7 hours now, but I can't seem to
find anything that would explain what's happening here.

For the curious, here's what ktrace gives me for the xfwm4 plugin:

43298 xfce-mcs-manager CALL  open(0xbfbfd430,0x601,0x1b6)
  43298 xfce-mcs-manager NAMI  "/home/bmeurer/.xfce4/settings/xfwm4.xml.tmp"
  43298 xfce-mcs-manager RET   open 4
  43298 xfce-mcs-manager CALL  fstat(0x4,0xbfbfd2b0)
  43298 xfce-mcs-manager RET   fstat 0
  43298 xfce-mcs-manager CALL  break(0x82a4000)
  43298 xfce-mcs-manager RET   break 0
  43298 xfce-mcs-manager CALL  write(0x2,0x829d000,0x64)
  43298 xfce-mcs-manager GIO   fd 2 wrote 100 bytes
        "libxfce4mcs-Message: file /home/bmeurer/.xfce4/settings/xfwm4.xml: fp->\
	_flags = 1160, fp->_file = 4
        "
  43298 xfce-mcs-manager RET   write 100/0x64
  43298 xfce-mcs-manager CALL  close(0x4)
  43298 xfce-mcs-manager RET   close 0
  43298 xfce-mcs-manager CALL  getpid
  43298 xfce-mcs-manager RET   getpid 43298/0xa922
  43298 xfce-mcs-manager CALL  write(0x2,0x8293000,0x95)
  43298 xfce-mcs-manager GIO   fd 2 wrote 149 bytes
        "
	(xfce-mcs-manager:43298): libxfce4mcs-CRITICAL **: Unable to close file\
	 handle for /home/bmeurer/.xfce4/settings/xfwm4.xml.tmp: Bad file descr\
	iptor
        "
  43298 xfce-mcs-manager RET   write 149/0x95

And here's what it gives for the margins plugin (which also saves its data
twice(?!) in the init function, and succeeds for some reason?!):

43298 xfce-mcs-manager CALL  open(0xbfbfddc0,0x601,0x1b6)
  43298 xfce-mcs-manager NAMI  "/home/bmeurer/.xfce4/settings/margins.xml.tmp"
  43298 xfce-mcs-manager RET   open 9
  43298 xfce-mcs-manager CALL  fstat(0x9,0xbfbfdc40)
  43298 xfce-mcs-manager RET   fstat 0
  43298 xfce-mcs-manager CALL  write(0x2,0x807b380,0x66)
  43298 xfce-mcs-manager GIO   fd 2 wrote 102 bytes
        "libxfce4mcs-Message: file /home/bmeurer/.xfce4/settings/margins.xml: fp\
	->_flags = 1160, fp->_file = 9
        "
  43298 xfce-mcs-manager RET   write 102/0x66
  43298 xfce-mcs-manager CALL  write(0x9,0x80a7000,0x150)
  43298 xfce-mcs-manager GIO   fd 9 wrote 336 bytes
        "<?xml version="1.0" encoding="UTF-8"?>
	<!DOCTYPE mcs-option SYSTEM "mcs-option.dtd">
	
	<mcs-option>
		<option name="Xfwm/BottomMargin" type="int" value="0"/>
		<option name="Xfwm/LeftMargin" type="int" value="0"/>
		<option name="Xfwm/RightMargin" type="int" value="0"/>
		<option name="Xfwm/TopMargin" type="int" value="42"/>
	</mcs-option>
        "
  43298 xfce-mcs-manager RET   write 336/0x150
  43298 xfce-mcs-manager CALL  close(0x9)
  43298 xfce-mcs-manager RET   close 0
  43298 xfce-mcs-manager CALL  rename(0xbfbfddc0,0x8075d80)
  43298 xfce-mcs-manager NAMI  "/home/bmeurer/.xfce4/settings/margins.xml.tmp"
  43298 xfce-mcs-manager NAMI  "/home/bmeurer/.xfce4/settings/margins.xml"
  43298 xfce-mcs-manager RET   rename 0
  43298 xfce-mcs-manager CALL  write(0x2,0x807b380,0x6d)
  43298 xfce-mcs-manager GIO   fd 2 wrote 109 bytes
        "** Message:   module /opt/xfce/lib/xfce4/mcs-plugins/workspaces_plugin.\
	so ("workspaces") successfully loaded
        "
  43298 xfce-mcs-manager RET   write 109/0x6d

As you can see, the file is opened successfully in both cases (fd 4 in the
xfwm4 case and fd 9 in the margins case). In the xfwm4 case, the stdio
functions fail for some reason: no write() on fd 4 ever shows up in the
trace. The fprintf's pretend that they worked (at least their return values
are ok), so I think it's the implicit flush called by fclose that fails.
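
To pin down whether it really is the flush that fails, one could call
fflush() explicitly before fclose() and report each step separately. This is
only a sketch of that idea, not the code in libxfce4mcs; the enum and the
function name save_with_explicit_flush are invented for the example:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical step codes, introduced only for this sketch. */
enum save_step { SAVE_OK = 0, SAVE_FOPEN, SAVE_WRITE, SAVE_FLUSH, SAVE_CLOSE };

/* Writes data to path and reports which stdio step failed, if any. */
static enum save_step
save_with_explicit_flush (const char *path, const char *data)
{
  FILE *fp;

  assert (path != NULL && data != NULL);

  fp = fopen (path, "w");
  if (fp == NULL)
    {
      fprintf (stderr, "fopen %s: %s\n", path, strerror (errno));
      return SAVE_FOPEN;
    }

  /* On a buffered stream this normally just fills the buffer, so it can
   * report success even if the descriptor is unusable. */
  if (fputs (data, fp) == EOF)
    {
      fprintf (stderr, "fputs %s: %s\n", path, strerror (errno));
      fclose (fp);
      return SAVE_WRITE;
    }

  /* Force the buffered data out now, so a write error is reported here
   * rather than by the implicit flush inside fclose(). */
  if (fflush (fp) == EOF || ferror (fp))
    {
      fprintf (stderr, "fflush %s: %s\n", path, strerror (errno));
      fclose (fp);
      return SAVE_FLUSH;
    }

  if (fclose (fp) == EOF)
    {
      fprintf (stderr, "fclose %s: %s\n", path, strerror (errno));
      return SAVE_CLOSE;
    }

  return SAVE_OK;
}
```

If the theory holds, the failing xfwm4 case would return SAVE_FLUSH instead
of SAVE_CLOSE, with the same EBADF in errno.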

And for completeness, here's the source code in question:

gboolean
mcs_manager_save_channel_to_file (McsManager  *manager,
				  const gchar *channel_name,
				  const gchar *filename)
{
   McsSetting *setting;
   McsList *iter;
   McsList *list;
   FILE *fp;
   int fd;
   gchar tmp_path[PATH_MAX];

   g_return_val_if_fail(manager != NULL, FALSE);
   /* Use && so that strlen() is never called on a NULL pointer. */
   g_return_val_if_fail((filename != NULL) && (strlen(filename) > 0), FALSE);
   g_return_val_if_fail((channel_name != NULL) && (strlen(channel_name) > 0),
		       FALSE);

   g_snprintf (tmp_path, PATH_MAX, "%s.tmp", filename);

   fp = fopen (tmp_path, "w");
#if 0
   {
     FILE *fp1 = fopen (tmp_path, "w");
     if (fp != NULL)
       fclose (fp);
     fp = fp1;
   }
#endif
   if (fp == NULL)
     {
       g_critical ("Unable to open file %s to store channel \"%s\" to: %s",
		  tmp_path,
		  channel_name,
		  g_strerror (errno));
       return FALSE;
     }

   /* Write header */
   fprintf (fp,
	   "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
	   "<!DOCTYPE mcs-option SYSTEM \"mcs-option.dtd\">\n"
	   "\n"
	   "<mcs-option>\n");

   list = mcs_manager_list_lookup (manager, channel_name);

   for (iter = list; iter != NULL; iter = iter->next)
     {
       setting = iter->setting;

       switch (setting->type) {
       case MCS_TYPE_INT:
	fprintf (fp,
		 "\t<option name=\"%s\" type=\"int\" value=\"%i\"/>\n",
		 setting->name, setting->data.v_int);
	break;
		
       case MCS_TYPE_COLOR:
	fprintf (fp, "\t<option name=\"%s\" type=\"color\" "
		 "value=\"%16u,%16u,%16u,%16u\"/>\n",
		 setting->name,
		 setting->data.v_color.red,
		 setting->data.v_color.green,
		 setting->data.v_color.blue,
		 setting->data.v_color.alpha);
	break;
	
       case MCS_TYPE_STRING:
	fprintf (fp, "\t<option name=\"%s\" type=\"string\" "
		 "value=\"%s\"/>\n",
		 setting->name,
		 setting->data.v_string);
	break;

       default:
	break;
       }
     }

   fprintf (fp, "</mcs-option>\n");

   g_message ("file %s: fp->_flags = %d, fp->_file = %d", filename,
              (int)fp->_flags, (int)fp->_file);

   if (fclose (fp) == EOF)
     {
       g_critical ("Unable to close file handle for %s: %s",
		  tmp_path,
		  g_strerror (errno));
       return FALSE;
     }

   if (rename (tmp_path, filename) < 0)
     {
       g_critical ("Unable to rename file %s to %s: %s",
		  tmp_path,
		  filename,
		  g_strerror (errno));
       return FALSE;
     }

   return TRUE;
}

As you can see, it's very straightforward.

[two hours later]

Maybe someone knows more about this than I do, so I'll try to explain what
I've figured out so far (I may file a FreeBSD problem report as well, though
I still think that the problem is with some of our plugins or is related to
gmodule):

After recompiling the system libraries for extensive debugging, I located the
problem in libc_r/uthread/uthread_fd.c. When _thread_fd_getflags() is called
from _swrite(), the flags for the given fd are set to 0 instead of O_WRONLY
as they should be (the file was opened "w"). The _thread_fd_table entry for
the fd exists, though, so the entry was set up prior to this, but for some
reason it was borked.
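
For comparison, the access mode of the descriptor behind a stream can also be
queried through fcntl(F_GETFL). Note the assumption here: this asks through
the normal fcntl() interface and may not show what libc_r keeps in its own
_thread_fd_table, so treat it as a rough cross-check only. The helper name
stream_is_write_only is made up for this sketch:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the descriptor behind fp was opened
 * write-only, 0 if it has another access mode, -1 if fcntl() fails. */
static int
stream_is_write_only (FILE *fp)
{
  int flags;

  flags = fcntl (fileno (fp), F_GETFL);
  if (flags == -1)
    return -1;

  /* O_ACCMODE masks out everything but the read/write access mode. */
  return (flags & O_ACCMODE) == O_WRONLY;
}
```

A stream opened with fopen (path, "w") should report 1 here. If it does while
the threading library still sees flags of 0, that would point at the library's
table rather than at the open() call itself.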

To get around this, I linked xfce-mcs-manager with -pthread/-lc_r, which seems
to fix the problem, but IMHO it's not a real solution; it's more of a hack.
Anyway, I'll modify xfce-mcs-manager to include -pthread/-lc_r for FreeBSD 5.x.

So now, if anyone has any particular idea what could be going on here, please 
drop me a note.

regards,
Benedikt

-- 
NetBSD Operating system:                       http://www.NetBSD.org/
pkgsrc "Work in progress":                  http://pkgsrc-wip.sf.net/
XFce desktop environment:                        http://www.xfce.org/
German Unix-AG Association:                   http://www.unix-ag.org/
os-network:                                 http://www.os-network.de/

OpenPGP Key: http://www.home.unix-ag.org/bmeurer/#gpg



