rpc.gssd: Inexplicable "Unknown code" error

Kevin Coffman kwc at citi.umich.edu
Tue Feb 5 23:45:18 EST 2008


On Feb 5, 2008 8:43 PM, Nathan Patwardhan <noopy.org at gmail.com> wrote:
> Hello,
>
> We recently tested NFSv4 in our dev environment and qualified it with
> 3 Linux distros (Debian Etch, Ubuntu Gutsy, SLES 10).  Our NFS server
> is a NetApp and our KDC is Win2k3.  Anyhow, since it worked in
> development, what's the worst thing that could happen when we tried to
> make it work in production?  :-)
>
> Under SLES-10, where we'd had things working in dev previously, I now
> see these errors when I try to mount our production NFS server:
>
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: rpcsec_gss:
> gss_init_sec_context: (major) Miscellaneous failure - (minor)
>   Unknown code
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: in authgss_destroy()
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: in
> authgss_destroy_context()
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: authgss_destroy:
> freeing name 0x50b570
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]:
> authgss_create_default: freeing name 0x51a760
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: WARNING: Failed
> to create krb5 context for user with uid 0 for server
>   prod-fs-cc1a-pubnet.kendall.corp.my.domain
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: WARNING: Failed
> to create krb5 context for user with uid 0 with credentials
>   cache FILE:/tmp/krb5cc_machine_CORP.MY.DOMAIN for server
> prod-fs-cc1a-pubnet.kendall.corp.my.domain
>   Feb  6 01:29:28 prod-unix-shell03 rpc.gssd[11402]: WARNING: Failed
> to create krb5 context for user with uid 0 with any credentials cache
> for server prod-fs-cc1a-pubnet.kendall.corp.my.domain
>
> For whatever reason, in production, /tmp/krb5cc_machine_CORP.MY.DOMAIN
> is missing a service principal for NFS:
>
>  # klist /tmp/krb5cc_machine_CORP.MY.DOMAIN
> Ticket cache: FILE:/tmp/krb5cc_machine_CORP.MY.DOMAIN
> Default principal: nfs/prod-unix-shell03.kendall.corp.my.domain at CORP.MY.DOMAIN
> Valid starting     Expires            Service principal
> 02/06/08 01:29:27  02/06/08 11:29:27  krbtgt/CORP.AKAMAI.COM at CORP.MY.DOMAIN
>         renew until 02/06/08 11:29:27
>
> Note that in our dev environment, we ARE NOT missing a service
> principal for NFS here.  Obviously, this is the nature of our problem.
> But why might this be?
>
>   - There's an SPN for our NFS server in our Windows KDC.
>   - DNS seems to be correct for our NFS server.
>   - keytab on the NFS server seems to be correct (verified service
> principal w/klist).
>   - Unknown error is shown, but unknown how?  I don't see any code here.
>
> Any ideas as to what's going on?

Argh.  The easiest way to figure out what is happening would be to get
a packet trace and see which principal the client is requesting a
service ticket for.  My initial guess is that there is a Kerberos
configuration issue and the client is trying to get a ticket from the
wrong realm.  (Look at the [domain_realm] mappings in krb5.conf?)
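
To illustrate why those mappings matter, here is a minimal Python
sketch of how a [domain_realm] lookup picks a realm for a server host
name.  It only approximates the MIT krb5 behaviour (exact host entry
first, then each enclosing ".domain" entry, then the upper-cased domain
part as a fallback); the function name realm_for_host and the mapping
table are hypothetical, and KDC referrals are ignored.

```python
def realm_for_host(hostname, domain_realm):
    """Approximate a krb5.conf [domain_realm] lookup (illustrative only)."""
    host = hostname.lower().rstrip(".")
    mappings = {key.lower(): realm for key, realm in domain_realm.items()}

    # 1. An exact host-name entry wins.
    if host in mappings:
        return mappings[host]

    # 2. Otherwise try each enclosing domain, most specific first,
    #    written with a leading dot as in krb5.conf.
    labels = host.split(".")
    for i in range(1, len(labels)):
        domain = "." + ".".join(labels[i:])
        if domain in mappings:
            return mappings[domain]

    # 3. Assumed fallback: the host's domain portion, upper-cased.
    if len(labels) > 1:
        return ".".join(labels[1:]).upper()
    return None


server = "prod-fs-cc1a-pubnet.kendall.corp.my.domain"

# Hypothetical mapping, not taken from the poster's krb5.conf.
with_mapping = realm_for_host(server, {".corp.my.domain": "CORP.MY.DOMAIN"})
without_mapping = realm_for_host(server, {})
print(with_mapping)
print(without_mapping)
```

Under these assumptions, a missing mapping sends the request for the
server's nfs/ principal to a realm derived from the host's own domain
rather than CORP.MY.DOMAIN, which is the kind of mismatch a packet
trace would show.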

The "unknown code" is unfortunate.  We ask the gss mechanism library
to decode the error code.  I'm not sure why it wouldn't be able to do
that.  Figuring out what the code is (from the packet trace) would be
the first step.  I'll also look into printing the actual error code
along with the text to help debugging in the future.
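
As a rough sketch of what "printing the actual error code along with
the text" could look like, the Python below splits a GSS-API major
status into its calling-error, routine-error and supplementary fields
(offsets and masks as defined in RFC 2744) and appends the raw numbers
to the message.  This is not the rpc.gssd code; describe_status is a
hypothetical helper, and the minor value used in the example is a
placeholder, not the code from the log above.

```python
# Field layout of a GSS-API major status word, per RFC 2744.
GSS_C_CALLING_ERROR_OFFSET = 24
GSS_C_ROUTINE_ERROR_OFFSET = 16
GSS_S_FAILURE = 13 << GSS_C_ROUTINE_ERROR_OFFSET  # generic routine error


def describe_status(major, minor, major_text, minor_text):
    """Format status texts together with their numeric codes."""
    calling = (major >> GSS_C_CALLING_ERROR_OFFSET) & 0xFF
    routine = (major >> GSS_C_ROUTINE_ERROR_OFFSET) & 0xFF
    supplementary = major & 0xFFFF
    return (
        f"(major) {major_text} "
        f"[0x{major:08x}: calling={calling}, routine={routine}, "
        f"supplementary=0x{supplementary:04x}] - "
        f"(minor) {minor_text} [{minor}]"
    )


# Placeholder minor code; the real value would come from the mechanism.
message = describe_status(GSS_S_FAILURE, 12345,
                          "Miscellaneous failure", "Unknown code")
print(message)
```

With the number in the log, an undecodable minor status can still be
looked up by hand in the mechanism's error tables.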

K.C.

