This is libc.info, produced by makeinfo version 5.2 from libc.texinfo.

This file documents the GNU C Library.

This is ‘The GNU C Library Reference Manual’, for version 2.25.

Copyright © 1993–2017 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being “Free Software Needs Free Documentation” and
“GNU Lesser General Public License”, the Front-Cover Texts being “A GNU
Manual”, and with the Back-Cover Texts as in (a) below. A copy of the
license is included in the section entitled “GNU Free Documentation
License”.

(a) The FSF’s Back-Cover Text is: “You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.”
|
INFO-DIR-SECTION Software libraries
|
START-INFO-DIR-ENTRY
|
* Libc: (libc). C library.
|
END-INFO-DIR-ENTRY
|
|
INFO-DIR-SECTION GNU C library functions and macros
|
START-INFO-DIR-ENTRY
|
* ALTWERASE: (libc)Local Modes.
|
* ARGP_ERR_UNKNOWN: (libc)Argp Parser Functions.
|
* ARG_MAX: (libc)General Limits.
|
* BC_BASE_MAX: (libc)Utility Limits.
|
* BC_DIM_MAX: (libc)Utility Limits.
|
* BC_SCALE_MAX: (libc)Utility Limits.
|
* BC_STRING_MAX: (libc)Utility Limits.
|
* BRKINT: (libc)Input Modes.
|
* BUFSIZ: (libc)Controlling Buffering.
|
* CCTS_OFLOW: (libc)Control Modes.
|
* CHILD_MAX: (libc)General Limits.
|
* CIGNORE: (libc)Control Modes.
|
* CLK_TCK: (libc)Processor Time.
|
* CLOCAL: (libc)Control Modes.
|
* CLOCKS_PER_SEC: (libc)CPU Time.
|
* COLL_WEIGHTS_MAX: (libc)Utility Limits.
|
* CPU_CLR: (libc)CPU Affinity.
|
* CPU_ISSET: (libc)CPU Affinity.
|
* CPU_SET: (libc)CPU Affinity.
|
* CPU_SETSIZE: (libc)CPU Affinity.
|
* CPU_ZERO: (libc)CPU Affinity.
|
* CREAD: (libc)Control Modes.
|
* CRTS_IFLOW: (libc)Control Modes.
|
* CS5: (libc)Control Modes.
|
* CS6: (libc)Control Modes.
|
* CS7: (libc)Control Modes.
|
* CS8: (libc)Control Modes.
|
* CSIZE: (libc)Control Modes.
|
* CSTOPB: (libc)Control Modes.
|
* DES_FAILED: (libc)DES Encryption.
|
* DTTOIF: (libc)Directory Entries.
|
* E2BIG: (libc)Error Codes.
|
* EACCES: (libc)Error Codes.
|
* EADDRINUSE: (libc)Error Codes.
|
* EADDRNOTAVAIL: (libc)Error Codes.
|
* EADV: (libc)Error Codes.
|
* EAFNOSUPPORT: (libc)Error Codes.
|
* EAGAIN: (libc)Error Codes.
|
* EALREADY: (libc)Error Codes.
|
* EAUTH: (libc)Error Codes.
|
* EBACKGROUND: (libc)Error Codes.
|
* EBADE: (libc)Error Codes.
|
* EBADF: (libc)Error Codes.
|
* EBADFD: (libc)Error Codes.
|
* EBADMSG: (libc)Error Codes.
|
* EBADR: (libc)Error Codes.
|
* EBADRPC: (libc)Error Codes.
|
* EBADRQC: (libc)Error Codes.
|
* EBADSLT: (libc)Error Codes.
|
* EBFONT: (libc)Error Codes.
|
* EBUSY: (libc)Error Codes.
|
* ECANCELED: (libc)Error Codes.
|
* ECHILD: (libc)Error Codes.
|
* ECHO: (libc)Local Modes.
|
* ECHOCTL: (libc)Local Modes.
|
* ECHOE: (libc)Local Modes.
|
* ECHOK: (libc)Local Modes.
|
* ECHOKE: (libc)Local Modes.
|
* ECHONL: (libc)Local Modes.
|
* ECHOPRT: (libc)Local Modes.
|
* ECHRNG: (libc)Error Codes.
|
* ECOMM: (libc)Error Codes.
|
* ECONNABORTED: (libc)Error Codes.
|
* ECONNREFUSED: (libc)Error Codes.
|
* ECONNRESET: (libc)Error Codes.
|
* ED: (libc)Error Codes.
|
* EDEADLK: (libc)Error Codes.
|
* EDEADLOCK: (libc)Error Codes.
|
* EDESTADDRREQ: (libc)Error Codes.
|
* EDIED: (libc)Error Codes.
|
* EDOM: (libc)Error Codes.
|
* EDOTDOT: (libc)Error Codes.
|
* EDQUOT: (libc)Error Codes.
|
* EEXIST: (libc)Error Codes.
|
* EFAULT: (libc)Error Codes.
|
* EFBIG: (libc)Error Codes.
|
* EFTYPE: (libc)Error Codes.
|
* EGRATUITOUS: (libc)Error Codes.
|
* EGREGIOUS: (libc)Error Codes.
|
* EHOSTDOWN: (libc)Error Codes.
|
* EHOSTUNREACH: (libc)Error Codes.
|
* EHWPOISON: (libc)Error Codes.
|
* EIDRM: (libc)Error Codes.
|
* EIEIO: (libc)Error Codes.
|
* EILSEQ: (libc)Error Codes.
|
* EINPROGRESS: (libc)Error Codes.
|
* EINTR: (libc)Error Codes.
|
* EINVAL: (libc)Error Codes.
|
* EIO: (libc)Error Codes.
|
* EISCONN: (libc)Error Codes.
|
* EISDIR: (libc)Error Codes.
|
* EISNAM: (libc)Error Codes.
|
* EKEYEXPIRED: (libc)Error Codes.
|
* EKEYREJECTED: (libc)Error Codes.
|
* EKEYREVOKED: (libc)Error Codes.
|
* EL2HLT: (libc)Error Codes.
|
* EL2NSYNC: (libc)Error Codes.
|
* EL3HLT: (libc)Error Codes.
|
* EL3RST: (libc)Error Codes.
|
* ELIBACC: (libc)Error Codes.
|
* ELIBBAD: (libc)Error Codes.
|
* ELIBEXEC: (libc)Error Codes.
|
* ELIBMAX: (libc)Error Codes.
|
* ELIBSCN: (libc)Error Codes.
|
* ELNRNG: (libc)Error Codes.
|
* ELOOP: (libc)Error Codes.
|
* EMEDIUMTYPE: (libc)Error Codes.
|
* EMFILE: (libc)Error Codes.
|
* EMLINK: (libc)Error Codes.
|
* EMSGSIZE: (libc)Error Codes.
|
* EMULTIHOP: (libc)Error Codes.
|
* ENAMETOOLONG: (libc)Error Codes.
|
* ENAVAIL: (libc)Error Codes.
|
* ENEEDAUTH: (libc)Error Codes.
|
* ENETDOWN: (libc)Error Codes.
|
* ENETRESET: (libc)Error Codes.
|
* ENETUNREACH: (libc)Error Codes.
|
* ENFILE: (libc)Error Codes.
|
* ENOANO: (libc)Error Codes.
|
* ENOBUFS: (libc)Error Codes.
|
* ENOCSI: (libc)Error Codes.
|
* ENODATA: (libc)Error Codes.
|
* ENODEV: (libc)Error Codes.
|
* ENOENT: (libc)Error Codes.
|
* ENOEXEC: (libc)Error Codes.
|
* ENOKEY: (libc)Error Codes.
|
* ENOLCK: (libc)Error Codes.
|
* ENOLINK: (libc)Error Codes.
|
* ENOMEDIUM: (libc)Error Codes.
|
* ENOMEM: (libc)Error Codes.
|
* ENOMSG: (libc)Error Codes.
|
* ENONET: (libc)Error Codes.
|
* ENOPKG: (libc)Error Codes.
|
* ENOPROTOOPT: (libc)Error Codes.
|
* ENOSPC: (libc)Error Codes.
|
* ENOSR: (libc)Error Codes.
|
* ENOSTR: (libc)Error Codes.
|
* ENOSYS: (libc)Error Codes.
|
* ENOTBLK: (libc)Error Codes.
|
* ENOTCONN: (libc)Error Codes.
|
* ENOTDIR: (libc)Error Codes.
|
* ENOTEMPTY: (libc)Error Codes.
|
* ENOTNAM: (libc)Error Codes.
|
* ENOTRECOVERABLE: (libc)Error Codes.
|
* ENOTSOCK: (libc)Error Codes.
|
* ENOTSUP: (libc)Error Codes.
|
* ENOTTY: (libc)Error Codes.
|
* ENOTUNIQ: (libc)Error Codes.
|
* ENXIO: (libc)Error Codes.
|
* EOF: (libc)EOF and Errors.
|
* EOPNOTSUPP: (libc)Error Codes.
|
* EOVERFLOW: (libc)Error Codes.
|
* EOWNERDEAD: (libc)Error Codes.
|
* EPERM: (libc)Error Codes.
|
* EPFNOSUPPORT: (libc)Error Codes.
|
* EPIPE: (libc)Error Codes.
|
* EPROCLIM: (libc)Error Codes.
|
* EPROCUNAVAIL: (libc)Error Codes.
|
* EPROGMISMATCH: (libc)Error Codes.
|
* EPROGUNAVAIL: (libc)Error Codes.
|
* EPROTO: (libc)Error Codes.
|
* EPROTONOSUPPORT: (libc)Error Codes.
|
* EPROTOTYPE: (libc)Error Codes.
|
* EQUIV_CLASS_MAX: (libc)Utility Limits.
|
* ERANGE: (libc)Error Codes.
|
* EREMCHG: (libc)Error Codes.
|
* EREMOTE: (libc)Error Codes.
|
* EREMOTEIO: (libc)Error Codes.
|
* ERESTART: (libc)Error Codes.
|
* ERFKILL: (libc)Error Codes.
|
* EROFS: (libc)Error Codes.
|
* ERPCMISMATCH: (libc)Error Codes.
|
* ESHUTDOWN: (libc)Error Codes.
|
* ESOCKTNOSUPPORT: (libc)Error Codes.
|
* ESPIPE: (libc)Error Codes.
|
* ESRCH: (libc)Error Codes.
|
* ESRMNT: (libc)Error Codes.
|
* ESTALE: (libc)Error Codes.
|
* ESTRPIPE: (libc)Error Codes.
|
* ETIME: (libc)Error Codes.
|
* ETIMEDOUT: (libc)Error Codes.
|
* ETOOMANYREFS: (libc)Error Codes.
|
* ETXTBSY: (libc)Error Codes.
|
* EUCLEAN: (libc)Error Codes.
|
* EUNATCH: (libc)Error Codes.
|
* EUSERS: (libc)Error Codes.
|
* EWOULDBLOCK: (libc)Error Codes.
|
* EXDEV: (libc)Error Codes.
|
* EXFULL: (libc)Error Codes.
|
* EXIT_FAILURE: (libc)Exit Status.
|
* EXIT_SUCCESS: (libc)Exit Status.
|
* EXPR_NEST_MAX: (libc)Utility Limits.
|
* FD_CLOEXEC: (libc)Descriptor Flags.
|
* FD_CLR: (libc)Waiting for I/O.
|
* FD_ISSET: (libc)Waiting for I/O.
|
* FD_SET: (libc)Waiting for I/O.
|
* FD_SETSIZE: (libc)Waiting for I/O.
|
* FD_ZERO: (libc)Waiting for I/O.
|
* FE_SNANS_ALWAYS_SIGNAL: (libc)Infinity and NaN.
|
* FILENAME_MAX: (libc)Limits for Files.
|
* FLUSHO: (libc)Local Modes.
|
* FOPEN_MAX: (libc)Opening Streams.
|
* FP_ILOGB0: (libc)Exponents and Logarithms.
|
* FP_ILOGBNAN: (libc)Exponents and Logarithms.
|
* FP_LLOGB0: (libc)Exponents and Logarithms.
|
* FP_LLOGBNAN: (libc)Exponents and Logarithms.
|
* F_DUPFD: (libc)Duplicating Descriptors.
|
* F_GETFD: (libc)Descriptor Flags.
|
* F_GETFL: (libc)Getting File Status Flags.
|
* F_GETLK: (libc)File Locks.
|
* F_GETOWN: (libc)Interrupt Input.
|
* F_OFD_GETLK: (libc)Open File Description Locks.
|
* F_OFD_SETLK: (libc)Open File Description Locks.
|
* F_OFD_SETLKW: (libc)Open File Description Locks.
|
* F_OK: (libc)Testing File Access.
|
* F_SETFD: (libc)Descriptor Flags.
|
* F_SETFL: (libc)Getting File Status Flags.
|
* F_SETLK: (libc)File Locks.
|
* F_SETLKW: (libc)File Locks.
|
* F_SETOWN: (libc)Interrupt Input.
|
* HUGE_VAL: (libc)Math Error Reporting.
|
* HUGE_VALF: (libc)Math Error Reporting.
|
* HUGE_VALL: (libc)Math Error Reporting.
|
* HUPCL: (libc)Control Modes.
|
* I: (libc)Complex Numbers.
|
* ICANON: (libc)Local Modes.
|
* ICRNL: (libc)Input Modes.
|
* IEXTEN: (libc)Local Modes.
|
* IFNAMSIZ: (libc)Interface Naming.
|
* IFTODT: (libc)Directory Entries.
|
* IGNBRK: (libc)Input Modes.
|
* IGNCR: (libc)Input Modes.
|
* IGNPAR: (libc)Input Modes.
|
* IMAXBEL: (libc)Input Modes.
|
* INADDR_ANY: (libc)Host Address Data Type.
|
* INADDR_BROADCAST: (libc)Host Address Data Type.
|
* INADDR_LOOPBACK: (libc)Host Address Data Type.
|
* INADDR_NONE: (libc)Host Address Data Type.
|
* INFINITY: (libc)Infinity and NaN.
|
* INLCR: (libc)Input Modes.
|
* INPCK: (libc)Input Modes.
|
* IPPORT_RESERVED: (libc)Ports.
|
* IPPORT_USERRESERVED: (libc)Ports.
|
* ISIG: (libc)Local Modes.
|
* ISTRIP: (libc)Input Modes.
|
* IXANY: (libc)Input Modes.
|
* IXOFF: (libc)Input Modes.
|
* IXON: (libc)Input Modes.
|
* LINE_MAX: (libc)Utility Limits.
|
* LINK_MAX: (libc)Limits for Files.
|
* L_ctermid: (libc)Identifying the Terminal.
|
* L_cuserid: (libc)Who Logged In.
|
* L_tmpnam: (libc)Temporary Files.
|
* MAXNAMLEN: (libc)Limits for Files.
|
* MAXSYMLINKS: (libc)Symbolic Links.
|
* MAX_CANON: (libc)Limits for Files.
|
* MAX_INPUT: (libc)Limits for Files.
|
* MB_CUR_MAX: (libc)Selecting the Conversion.
|
* MB_LEN_MAX: (libc)Selecting the Conversion.
|
* MDMBUF: (libc)Control Modes.
|
* MSG_DONTROUTE: (libc)Socket Data Options.
|
* MSG_OOB: (libc)Socket Data Options.
|
* MSG_PEEK: (libc)Socket Data Options.
|
* NAME_MAX: (libc)Limits for Files.
|
* NAN: (libc)Infinity and NaN.
|
* NCCS: (libc)Mode Data Types.
|
* NGROUPS_MAX: (libc)General Limits.
|
* NOFLSH: (libc)Local Modes.
|
* NOKERNINFO: (libc)Local Modes.
|
* NSIG: (libc)Standard Signals.
|
* NULL: (libc)Null Pointer Constant.
|
* ONLCR: (libc)Output Modes.
|
* ONOEOT: (libc)Output Modes.
|
* OPEN_MAX: (libc)General Limits.
|
* OPOST: (libc)Output Modes.
|
* OXTABS: (libc)Output Modes.
|
* O_ACCMODE: (libc)Access Modes.
|
* O_APPEND: (libc)Operating Modes.
|
* O_ASYNC: (libc)Operating Modes.
|
* O_CREAT: (libc)Open-time Flags.
|
* O_EXCL: (libc)Open-time Flags.
|
* O_EXEC: (libc)Access Modes.
|
* O_EXLOCK: (libc)Open-time Flags.
|
* O_FSYNC: (libc)Operating Modes.
|
* O_IGNORE_CTTY: (libc)Open-time Flags.
|
* O_NDELAY: (libc)Operating Modes.
|
* O_NOATIME: (libc)Operating Modes.
|
* O_NOCTTY: (libc)Open-time Flags.
|
* O_NOLINK: (libc)Open-time Flags.
|
* O_NONBLOCK: (libc)Open-time Flags.
|
* O_NONBLOCK: (libc)Operating Modes.
|
* O_NOTRANS: (libc)Open-time Flags.
|
* O_RDONLY: (libc)Access Modes.
|
* O_RDWR: (libc)Access Modes.
|
* O_READ: (libc)Access Modes.
|
* O_SHLOCK: (libc)Open-time Flags.
|
* O_SYNC: (libc)Operating Modes.
|
* O_TRUNC: (libc)Open-time Flags.
|
* O_WRITE: (libc)Access Modes.
|
* O_WRONLY: (libc)Access Modes.
|
* PARENB: (libc)Control Modes.
|
* PARMRK: (libc)Input Modes.
|
* PARODD: (libc)Control Modes.
|
* PATH_MAX: (libc)Limits for Files.
|
* PA_FLAG_MASK: (libc)Parsing a Template String.
|
* PENDIN: (libc)Local Modes.
|
* PF_FILE: (libc)Local Namespace Details.
|
* PF_INET6: (libc)Internet Namespace.
|
* PF_INET: (libc)Internet Namespace.
|
* PF_LOCAL: (libc)Local Namespace Details.
|
* PF_UNIX: (libc)Local Namespace Details.
|
* PIPE_BUF: (libc)Limits for Files.
|
* P_tmpdir: (libc)Temporary Files.
|
* RAND_MAX: (libc)ISO Random.
|
* RE_DUP_MAX: (libc)General Limits.
|
* RLIM_INFINITY: (libc)Limits on Resources.
|
* R_OK: (libc)Testing File Access.
|
* SA_NOCLDSTOP: (libc)Flags for Sigaction.
|
* SA_ONSTACK: (libc)Flags for Sigaction.
|
* SA_RESTART: (libc)Flags for Sigaction.
|
* SEEK_CUR: (libc)File Positioning.
|
* SEEK_END: (libc)File Positioning.
|
* SEEK_SET: (libc)File Positioning.
|
* SIGABRT: (libc)Program Error Signals.
|
* SIGALRM: (libc)Alarm Signals.
|
* SIGBUS: (libc)Program Error Signals.
|
* SIGCHLD: (libc)Job Control Signals.
|
* SIGCLD: (libc)Job Control Signals.
|
* SIGCONT: (libc)Job Control Signals.
|
* SIGEMT: (libc)Program Error Signals.
|
* SIGFPE: (libc)Program Error Signals.
|
* SIGHUP: (libc)Termination Signals.
|
* SIGILL: (libc)Program Error Signals.
|
* SIGINFO: (libc)Miscellaneous Signals.
|
* SIGINT: (libc)Termination Signals.
|
* SIGIO: (libc)Asynchronous I/O Signals.
|
* SIGIOT: (libc)Program Error Signals.
|
* SIGKILL: (libc)Termination Signals.
|
* SIGLOST: (libc)Operation Error Signals.
|
* SIGPIPE: (libc)Operation Error Signals.
|
* SIGPOLL: (libc)Asynchronous I/O Signals.
|
* SIGPROF: (libc)Alarm Signals.
|
* SIGQUIT: (libc)Termination Signals.
|
* SIGSEGV: (libc)Program Error Signals.
|
* SIGSTOP: (libc)Job Control Signals.
|
* SIGSYS: (libc)Program Error Signals.
|
* SIGTERM: (libc)Termination Signals.
|
* SIGTRAP: (libc)Program Error Signals.
|
* SIGTSTP: (libc)Job Control Signals.
|
* SIGTTIN: (libc)Job Control Signals.
|
* SIGTTOU: (libc)Job Control Signals.
|
* SIGURG: (libc)Asynchronous I/O Signals.
|
* SIGUSR1: (libc)Miscellaneous Signals.
|
* SIGUSR2: (libc)Miscellaneous Signals.
|
* SIGVTALRM: (libc)Alarm Signals.
|
* SIGWINCH: (libc)Miscellaneous Signals.
|
* SIGXCPU: (libc)Operation Error Signals.
|
* SIGXFSZ: (libc)Operation Error Signals.
|
* SIG_ERR: (libc)Basic Signal Handling.
|
* SNAN: (libc)Infinity and NaN.
|
* SNANF: (libc)Infinity and NaN.
|
* SNANL: (libc)Infinity and NaN.
|
* SOCK_DGRAM: (libc)Communication Styles.
|
* SOCK_RAW: (libc)Communication Styles.
|
* SOCK_RDM: (libc)Communication Styles.
|
* SOCK_SEQPACKET: (libc)Communication Styles.
|
* SOCK_STREAM: (libc)Communication Styles.
|
* SOL_SOCKET: (libc)Socket-Level Options.
|
* SSIZE_MAX: (libc)General Limits.
|
* STREAM_MAX: (libc)General Limits.
|
* SUN_LEN: (libc)Local Namespace Details.
|
* S_IFMT: (libc)Testing File Type.
|
* S_ISBLK: (libc)Testing File Type.
|
* S_ISCHR: (libc)Testing File Type.
|
* S_ISDIR: (libc)Testing File Type.
|
* S_ISFIFO: (libc)Testing File Type.
|
* S_ISLNK: (libc)Testing File Type.
|
* S_ISREG: (libc)Testing File Type.
|
* S_ISSOCK: (libc)Testing File Type.
|
* S_TYPEISMQ: (libc)Testing File Type.
|
* S_TYPEISSEM: (libc)Testing File Type.
|
* S_TYPEISSHM: (libc)Testing File Type.
|
* TMP_MAX: (libc)Temporary Files.
|
* TOSTOP: (libc)Local Modes.
|
* TZNAME_MAX: (libc)General Limits.
|
* VDISCARD: (libc)Other Special.
|
* VDSUSP: (libc)Signal Characters.
|
* VEOF: (libc)Editing Characters.
|
* VEOL2: (libc)Editing Characters.
|
* VEOL: (libc)Editing Characters.
|
* VERASE: (libc)Editing Characters.
|
* VINTR: (libc)Signal Characters.
|
* VKILL: (libc)Editing Characters.
|
* VLNEXT: (libc)Other Special.
|
* VMIN: (libc)Noncanonical Input.
|
* VQUIT: (libc)Signal Characters.
|
* VREPRINT: (libc)Editing Characters.
|
* VSTART: (libc)Start/Stop Characters.
|
* VSTATUS: (libc)Other Special.
|
* VSTOP: (libc)Start/Stop Characters.
|
* VSUSP: (libc)Signal Characters.
|
* VTIME: (libc)Noncanonical Input.
|
* VWERASE: (libc)Editing Characters.
|
* WCHAR_MAX: (libc)Extended Char Intro.
|
* WCHAR_MIN: (libc)Extended Char Intro.
|
* WCOREDUMP: (libc)Process Completion Status.
|
* WEOF: (libc)EOF and Errors.
|
* WEOF: (libc)Extended Char Intro.
|
* WEXITSTATUS: (libc)Process Completion Status.
|
* WIFEXITED: (libc)Process Completion Status.
|
* WIFSIGNALED: (libc)Process Completion Status.
|
* WIFSTOPPED: (libc)Process Completion Status.
|
* WSTOPSIG: (libc)Process Completion Status.
|
* WTERMSIG: (libc)Process Completion Status.
|
* W_OK: (libc)Testing File Access.
|
* X_OK: (libc)Testing File Access.
|
* _Complex_I: (libc)Complex Numbers.
|
* _Exit: (libc)Termination Internals.
|
* _IOFBF: (libc)Controlling Buffering.
|
* _IOLBF: (libc)Controlling Buffering.
|
* _IONBF: (libc)Controlling Buffering.
|
* _Imaginary_I: (libc)Complex Numbers.
|
* _PATH_UTMP: (libc)Manipulating the Database.
|
* _PATH_WTMP: (libc)Manipulating the Database.
|
* _POSIX2_C_DEV: (libc)System Options.
|
* _POSIX2_C_VERSION: (libc)Version Supported.
|
* _POSIX2_FORT_DEV: (libc)System Options.
|
* _POSIX2_FORT_RUN: (libc)System Options.
|
* _POSIX2_LOCALEDEF: (libc)System Options.
|
* _POSIX2_SW_DEV: (libc)System Options.
|
* _POSIX_CHOWN_RESTRICTED: (libc)Options for Files.
|
* _POSIX_JOB_CONTROL: (libc)System Options.
|
* _POSIX_NO_TRUNC: (libc)Options for Files.
|
* _POSIX_SAVED_IDS: (libc)System Options.
|
* _POSIX_VDISABLE: (libc)Options for Files.
|
* _POSIX_VERSION: (libc)Version Supported.
|
* __fbufsize: (libc)Controlling Buffering.
|
* __flbf: (libc)Controlling Buffering.
|
* __fpending: (libc)Controlling Buffering.
|
* __fpurge: (libc)Flushing Buffers.
|
* __freadable: (libc)Opening Streams.
|
* __freading: (libc)Opening Streams.
|
* __fsetlocking: (libc)Streams and Threads.
|
* __fwritable: (libc)Opening Streams.
|
* __fwriting: (libc)Opening Streams.
|
* __gconv_end_fct: (libc)glibc iconv Implementation.
|
* __gconv_fct: (libc)glibc iconv Implementation.
|
* __gconv_init_fct: (libc)glibc iconv Implementation.
|
* __ppc_get_timebase: (libc)PowerPC.
|
* __ppc_get_timebase_freq: (libc)PowerPC.
|
* __ppc_mdoio: (libc)PowerPC.
|
* __ppc_mdoom: (libc)PowerPC.
|
* __ppc_set_ppr_low: (libc)PowerPC.
|
* __ppc_set_ppr_med: (libc)PowerPC.
|
* __ppc_set_ppr_med_high: (libc)PowerPC.
|
* __ppc_set_ppr_med_low: (libc)PowerPC.
|
* __ppc_set_ppr_very_low: (libc)PowerPC.
|
* __ppc_yield: (libc)PowerPC.
|
* __va_copy: (libc)Argument Macros.
|
* _exit: (libc)Termination Internals.
|
* _flushlbf: (libc)Flushing Buffers.
|
* _tolower: (libc)Case Conversion.
|
* _toupper: (libc)Case Conversion.
|
* a64l: (libc)Encode Binary Data.
|
* abort: (libc)Aborting a Program.
|
* abs: (libc)Absolute Value.
|
* accept: (libc)Accepting Connections.
|
* access: (libc)Testing File Access.
|
* acos: (libc)Inverse Trig Functions.
|
* acosf: (libc)Inverse Trig Functions.
|
* acosh: (libc)Hyperbolic Functions.
|
* acoshf: (libc)Hyperbolic Functions.
|
* acoshl: (libc)Hyperbolic Functions.
|
* acosl: (libc)Inverse Trig Functions.
|
* addmntent: (libc)mtab.
|
* addseverity: (libc)Adding Severity Classes.
|
* adjtime: (libc)High-Resolution Calendar.
|
* adjtimex: (libc)High-Resolution Calendar.
|
* aio_cancel64: (libc)Cancel AIO Operations.
|
* aio_cancel: (libc)Cancel AIO Operations.
|
* aio_error64: (libc)Status of AIO Operations.
|
* aio_error: (libc)Status of AIO Operations.
|
* aio_fsync64: (libc)Synchronizing AIO Operations.
|
* aio_fsync: (libc)Synchronizing AIO Operations.
|
* aio_init: (libc)Configuration of AIO.
|
* aio_read64: (libc)Asynchronous Reads/Writes.
|
* aio_read: (libc)Asynchronous Reads/Writes.
|
* aio_return64: (libc)Status of AIO Operations.
|
* aio_return: (libc)Status of AIO Operations.
|
* aio_suspend64: (libc)Synchronizing AIO Operations.
|
* aio_suspend: (libc)Synchronizing AIO Operations.
|
* aio_write64: (libc)Asynchronous Reads/Writes.
|
* aio_write: (libc)Asynchronous Reads/Writes.
|
* alarm: (libc)Setting an Alarm.
|
* aligned_alloc: (libc)Aligned Memory Blocks.
|
* alloca: (libc)Variable Size Automatic.
|
* alphasort64: (libc)Scanning Directory Content.
|
* alphasort: (libc)Scanning Directory Content.
|
* argp_error: (libc)Argp Helper Functions.
|
* argp_failure: (libc)Argp Helper Functions.
|
* argp_help: (libc)Argp Help.
|
* argp_parse: (libc)Argp.
|
* argp_state_help: (libc)Argp Helper Functions.
|
* argp_usage: (libc)Argp Helper Functions.
|
* argz_add: (libc)Argz Functions.
|
* argz_add_sep: (libc)Argz Functions.
|
* argz_append: (libc)Argz Functions.
|
* argz_count: (libc)Argz Functions.
|
* argz_create: (libc)Argz Functions.
|
* argz_create_sep: (libc)Argz Functions.
|
* argz_delete: (libc)Argz Functions.
|
* argz_extract: (libc)Argz Functions.
|
* argz_insert: (libc)Argz Functions.
|
* argz_next: (libc)Argz Functions.
|
* argz_replace: (libc)Argz Functions.
|
* argz_stringify: (libc)Argz Functions.
|
* asctime: (libc)Formatting Calendar Time.
|
* asctime_r: (libc)Formatting Calendar Time.
|
* asin: (libc)Inverse Trig Functions.
|
* asinf: (libc)Inverse Trig Functions.
|
* asinh: (libc)Hyperbolic Functions.
|
* asinhf: (libc)Hyperbolic Functions.
|
* asinhl: (libc)Hyperbolic Functions.
|
* asinl: (libc)Inverse Trig Functions.
|
* asprintf: (libc)Dynamic Output.
|
* assert: (libc)Consistency Checking.
|
* assert_perror: (libc)Consistency Checking.
|
* atan2: (libc)Inverse Trig Functions.
|
* atan2f: (libc)Inverse Trig Functions.
|
* atan2l: (libc)Inverse Trig Functions.
|
* atan: (libc)Inverse Trig Functions.
|
* atanf: (libc)Inverse Trig Functions.
|
* atanh: (libc)Hyperbolic Functions.
|
* atanhf: (libc)Hyperbolic Functions.
|
* atanhl: (libc)Hyperbolic Functions.
|
* atanl: (libc)Inverse Trig Functions.
|
* atexit: (libc)Cleanups on Exit.
|
* atof: (libc)Parsing of Floats.
|
* atoi: (libc)Parsing of Integers.
|
* atol: (libc)Parsing of Integers.
|
* atoll: (libc)Parsing of Integers.
|
* backtrace: (libc)Backtraces.
|
* backtrace_symbols: (libc)Backtraces.
|
* backtrace_symbols_fd: (libc)Backtraces.
|
* basename: (libc)Finding Tokens in a String.
|
* bcmp: (libc)String/Array Comparison.
|
* bcopy: (libc)Copying Strings and Arrays.
|
* bind: (libc)Setting Address.
|
* bind_textdomain_codeset: (libc)Charset conversion in gettext.
|
* bindtextdomain: (libc)Locating gettext catalog.
|
* brk: (libc)Resizing the Data Segment.
|
* bsearch: (libc)Array Search Function.
|
* btowc: (libc)Converting a Character.
|
* bzero: (libc)Copying Strings and Arrays.
|
* cabs: (libc)Absolute Value.
|
* cabsf: (libc)Absolute Value.
|
* cabsl: (libc)Absolute Value.
|
* cacos: (libc)Inverse Trig Functions.
|
* cacosf: (libc)Inverse Trig Functions.
|
* cacosh: (libc)Hyperbolic Functions.
|
* cacoshf: (libc)Hyperbolic Functions.
|
* cacoshl: (libc)Hyperbolic Functions.
|
* cacosl: (libc)Inverse Trig Functions.
|
* calloc: (libc)Allocating Cleared Space.
|
* canonicalize: (libc)FP Bit Twiddling.
|
* canonicalize_file_name: (libc)Symbolic Links.
|
* canonicalizef: (libc)FP Bit Twiddling.
|
* canonicalizel: (libc)FP Bit Twiddling.
|
* carg: (libc)Operations on Complex.
|
* cargf: (libc)Operations on Complex.
|
* cargl: (libc)Operations on Complex.
|
* casin: (libc)Inverse Trig Functions.
|
* casinf: (libc)Inverse Trig Functions.
|
* casinh: (libc)Hyperbolic Functions.
|
* casinhf: (libc)Hyperbolic Functions.
|
* casinhl: (libc)Hyperbolic Functions.
|
* casinl: (libc)Inverse Trig Functions.
|
* catan: (libc)Inverse Trig Functions.
|
* catanf: (libc)Inverse Trig Functions.
|
* catanh: (libc)Hyperbolic Functions.
|
* catanhf: (libc)Hyperbolic Functions.
|
* catanhl: (libc)Hyperbolic Functions.
|
* catanl: (libc)Inverse Trig Functions.
|
* catclose: (libc)The catgets Functions.
|
* catgets: (libc)The catgets Functions.
|
* catopen: (libc)The catgets Functions.
|
* cbc_crypt: (libc)DES Encryption.
|
* cbrt: (libc)Exponents and Logarithms.
|
* cbrtf: (libc)Exponents and Logarithms.
|
* cbrtl: (libc)Exponents and Logarithms.
|
* ccos: (libc)Trig Functions.
|
* ccosf: (libc)Trig Functions.
|
* ccosh: (libc)Hyperbolic Functions.
|
* ccoshf: (libc)Hyperbolic Functions.
|
* ccoshl: (libc)Hyperbolic Functions.
|
* ccosl: (libc)Trig Functions.
|
* ceil: (libc)Rounding Functions.
|
* ceilf: (libc)Rounding Functions.
|
* ceill: (libc)Rounding Functions.
|
* cexp: (libc)Exponents and Logarithms.
|
* cexpf: (libc)Exponents and Logarithms.
|
* cexpl: (libc)Exponents and Logarithms.
|
* cfgetispeed: (libc)Line Speed.
|
* cfgetospeed: (libc)Line Speed.
|
* cfmakeraw: (libc)Noncanonical Input.
|
* cfree: (libc)Freeing after Malloc.
|
* cfsetispeed: (libc)Line Speed.
|
* cfsetospeed: (libc)Line Speed.
|
* cfsetspeed: (libc)Line Speed.
|
* chdir: (libc)Working Directory.
|
* chmod: (libc)Setting Permissions.
|
* chown: (libc)File Owner.
|
* cimag: (libc)Operations on Complex.
|
* cimagf: (libc)Operations on Complex.
|
* cimagl: (libc)Operations on Complex.
|
* clearenv: (libc)Environment Access.
|
* clearerr: (libc)Error Recovery.
|
* clearerr_unlocked: (libc)Error Recovery.
|
* clock: (libc)CPU Time.
|
* clog10: (libc)Exponents and Logarithms.
|
* clog10f: (libc)Exponents and Logarithms.
|
* clog10l: (libc)Exponents and Logarithms.
|
* clog: (libc)Exponents and Logarithms.
|
* clogf: (libc)Exponents and Logarithms.
|
* clogl: (libc)Exponents and Logarithms.
|
* close: (libc)Opening and Closing Files.
|
* closedir: (libc)Reading/Closing Directory.
|
* closelog: (libc)closelog.
|
* confstr: (libc)String Parameters.
|
* conj: (libc)Operations on Complex.
|
* conjf: (libc)Operations on Complex.
|
* conjl: (libc)Operations on Complex.
|
* connect: (libc)Connecting.
|
* copysign: (libc)FP Bit Twiddling.
|
* copysignf: (libc)FP Bit Twiddling.
|
* copysignl: (libc)FP Bit Twiddling.
|
* cos: (libc)Trig Functions.
|
* cosf: (libc)Trig Functions.
|
* cosh: (libc)Hyperbolic Functions.
|
* coshf: (libc)Hyperbolic Functions.
|
* coshl: (libc)Hyperbolic Functions.
|
* cosl: (libc)Trig Functions.
|
* cpow: (libc)Exponents and Logarithms.
|
* cpowf: (libc)Exponents and Logarithms.
|
* cpowl: (libc)Exponents and Logarithms.
|
* cproj: (libc)Operations on Complex.
|
* cprojf: (libc)Operations on Complex.
|
* cprojl: (libc)Operations on Complex.
|
* creal: (libc)Operations on Complex.
|
* crealf: (libc)Operations on Complex.
|
* creall: (libc)Operations on Complex.
|
* creat64: (libc)Opening and Closing Files.
|
* creat: (libc)Opening and Closing Files.
|
* crypt: (libc)crypt.
|
* crypt_r: (libc)crypt.
|
* csin: (libc)Trig Functions.
|
* csinf: (libc)Trig Functions.
|
* csinh: (libc)Hyperbolic Functions.
|
* csinhf: (libc)Hyperbolic Functions.
|
* csinhl: (libc)Hyperbolic Functions.
|
* csinl: (libc)Trig Functions.
|
* csqrt: (libc)Exponents and Logarithms.
|
* csqrtf: (libc)Exponents and Logarithms.
|
* csqrtl: (libc)Exponents and Logarithms.
|
* ctan: (libc)Trig Functions.
|
* ctanf: (libc)Trig Functions.
|
* ctanh: (libc)Hyperbolic Functions.
|
* ctanhf: (libc)Hyperbolic Functions.
|
* ctanhl: (libc)Hyperbolic Functions.
|
* ctanl: (libc)Trig Functions.
|
* ctermid: (libc)Identifying the Terminal.
|
* ctime: (libc)Formatting Calendar Time.
|
* ctime_r: (libc)Formatting Calendar Time.
|
* cuserid: (libc)Who Logged In.
|
* dcgettext: (libc)Translation with gettext.
|
* dcngettext: (libc)Advanced gettext functions.
|
* des_setparity: (libc)DES Encryption.
|
* dgettext: (libc)Translation with gettext.
|
* difftime: (libc)Elapsed Time.
|
* dirfd: (libc)Opening a Directory.
|
* dirname: (libc)Finding Tokens in a String.
|
* div: (libc)Integer Division.
|
* dngettext: (libc)Advanced gettext functions.
|
* drand48: (libc)SVID Random.
|
* drand48_r: (libc)SVID Random.
|
* drem: (libc)Remainder Functions.
|
* dremf: (libc)Remainder Functions.
|
* dreml: (libc)Remainder Functions.
|
* dup2: (libc)Duplicating Descriptors.
|
* dup: (libc)Duplicating Descriptors.
|
* ecb_crypt: (libc)DES Encryption.
|
* ecvt: (libc)System V Number Conversion.
|
* ecvt_r: (libc)System V Number Conversion.
|
* encrypt: (libc)DES Encryption.
|
* encrypt_r: (libc)DES Encryption.
|
* endfsent: (libc)fstab.
|
* endgrent: (libc)Scanning All Groups.
|
* endhostent: (libc)Host Names.
|
* endmntent: (libc)mtab.
|
* endnetent: (libc)Networks Database.
|
* endnetgrent: (libc)Lookup Netgroup.
|
* endprotoent: (libc)Protocols Database.
|
* endpwent: (libc)Scanning All Users.
|
* endservent: (libc)Services Database.
|
* endutent: (libc)Manipulating the Database.
|
* endutxent: (libc)XPG Functions.
|
* envz_add: (libc)Envz Functions.
|
* envz_entry: (libc)Envz Functions.
|
* envz_get: (libc)Envz Functions.
|
* envz_merge: (libc)Envz Functions.
|
* envz_remove: (libc)Envz Functions.
|
* envz_strip: (libc)Envz Functions.
|
* erand48: (libc)SVID Random.
|
* erand48_r: (libc)SVID Random.
|
* erf: (libc)Special Functions.
|
* erfc: (libc)Special Functions.
|
* erfcf: (libc)Special Functions.
|
* erfcl: (libc)Special Functions.
|
* erff: (libc)Special Functions.
|
* erfl: (libc)Special Functions.
|
* err: (libc)Error Messages.
|
* errno: (libc)Checking for Errors.
|
* error: (libc)Error Messages.
|
* error_at_line: (libc)Error Messages.
|
* errx: (libc)Error Messages.
|
* execl: (libc)Executing a File.
|
* execle: (libc)Executing a File.
|
* execlp: (libc)Executing a File.
|
* execv: (libc)Executing a File.
|
* execve: (libc)Executing a File.
|
* execvp: (libc)Executing a File.
|
* exit: (libc)Normal Termination.
|
* exp10: (libc)Exponents and Logarithms.
|
* exp10f: (libc)Exponents and Logarithms.
|
* exp10l: (libc)Exponents and Logarithms.
|
* exp2: (libc)Exponents and Logarithms.
|
* exp2f: (libc)Exponents and Logarithms.
|
* exp2l: (libc)Exponents and Logarithms.
|
* exp: (libc)Exponents and Logarithms.
|
* expf: (libc)Exponents and Logarithms.
|
* expl: (libc)Exponents and Logarithms.
|
* explicit_bzero: (libc)Erasing Sensitive Data.
|
* expm1: (libc)Exponents and Logarithms.
|
* expm1f: (libc)Exponents and Logarithms.
|
* expm1l: (libc)Exponents and Logarithms.
|
* fabs: (libc)Absolute Value.
|
* fabsf: (libc)Absolute Value.
|
* fabsl: (libc)Absolute Value.
|
* fchdir: (libc)Working Directory.
|
* fchmod: (libc)Setting Permissions.
|
* fchown: (libc)File Owner.
|
* fclose: (libc)Closing Streams.
|
* fcloseall: (libc)Closing Streams.
|
* fcntl: (libc)Control Operations.
|
* fcvt: (libc)System V Number Conversion.
|
* fcvt_r: (libc)System V Number Conversion.
|
* fdatasync: (libc)Synchronizing I/O.
|
* fdim: (libc)Misc FP Arithmetic.
|
* fdimf: (libc)Misc FP Arithmetic.
|
* fdiml: (libc)Misc FP Arithmetic.
|
* fdopen: (libc)Descriptors and Streams.
|
* fdopendir: (libc)Opening a Directory.
|
* feclearexcept: (libc)Status bit operations.
|
* fedisableexcept: (libc)Control Functions.
|
* feenableexcept: (libc)Control Functions.
|
* fegetenv: (libc)Control Functions.
|
* fegetexcept: (libc)Control Functions.
|
* fegetexceptflag: (libc)Status bit operations.
|
* fegetmode: (libc)Control Functions.
|
* fegetround: (libc)Rounding.
|
* feholdexcept: (libc)Control Functions.
|
* feof: (libc)EOF and Errors.
|
* feof_unlocked: (libc)EOF and Errors.
|
* feraiseexcept: (libc)Status bit operations.
|
* ferror: (libc)EOF and Errors.
|
* ferror_unlocked: (libc)EOF and Errors.
|
* fesetenv: (libc)Control Functions.
|
* fesetexcept: (libc)Status bit operations.
|
* fesetexceptflag: (libc)Status bit operations.
|
* fesetmode: (libc)Control Functions.
|
* fesetround: (libc)Rounding.
|
* fetestexcept: (libc)Status bit operations.
|
* fetestexceptflag: (libc)Status bit operations.
|
* feupdateenv: (libc)Control Functions.
|
* fflush: (libc)Flushing Buffers.
|
* fflush_unlocked: (libc)Flushing Buffers.
|
* fgetc: (libc)Character Input.
|
* fgetc_unlocked: (libc)Character Input.
|
* fgetgrent: (libc)Scanning All Groups.
|
* fgetgrent_r: (libc)Scanning All Groups.
|
* fgetpos64: (libc)Portable Positioning.
|
* fgetpos: (libc)Portable Positioning.
|
* fgetpwent: (libc)Scanning All Users.
|
* fgetpwent_r: (libc)Scanning All Users.
|
* fgets: (libc)Line Input.
|
* fgets_unlocked: (libc)Line Input.
|
* fgetwc: (libc)Character Input.
|
* fgetwc_unlocked: (libc)Character Input.
|
* fgetws: (libc)Line Input.
|
* fgetws_unlocked: (libc)Line Input.
|
* fileno: (libc)Descriptors and Streams.
|
* fileno_unlocked: (libc)Descriptors and Streams.
|
* finite: (libc)Floating Point Classes.
|
* finitef: (libc)Floating Point Classes.
|
* finitel: (libc)Floating Point Classes.
|
* flockfile: (libc)Streams and Threads.
|
* floor: (libc)Rounding Functions.
|
* floorf: (libc)Rounding Functions.
|
* floorl: (libc)Rounding Functions.
|
* fma: (libc)Misc FP Arithmetic.
|
* fmaf: (libc)Misc FP Arithmetic.
|
* fmal: (libc)Misc FP Arithmetic.
|
* fmax: (libc)Misc FP Arithmetic.
|
* fmaxf: (libc)Misc FP Arithmetic.
|
* fmaxl: (libc)Misc FP Arithmetic.
|
* fmaxmag: (libc)Misc FP Arithmetic.
|
* fmaxmagf: (libc)Misc FP Arithmetic.
|
* fmaxmagl: (libc)Misc FP Arithmetic.
|
* fmemopen: (libc)String Streams.
|
* fmin: (libc)Misc FP Arithmetic.
|
* fminf: (libc)Misc FP Arithmetic.
|
* fminl: (libc)Misc FP Arithmetic.
|
* fminmag: (libc)Misc FP Arithmetic.
|
* fminmagf: (libc)Misc FP Arithmetic.
|
* fminmagl: (libc)Misc FP Arithmetic.
|
* fmod: (libc)Remainder Functions.
|
* fmodf: (libc)Remainder Functions.
|
* fmodl: (libc)Remainder Functions.
|
* fmtmsg: (libc)Printing Formatted Messages.
|
* fnmatch: (libc)Wildcard Matching.
|
* fopen64: (libc)Opening Streams.
|
* fopen: (libc)Opening Streams.
|
* fopencookie: (libc)Streams and Cookies.
|
* fork: (libc)Creating a Process.
|
* forkpty: (libc)Pseudo-Terminal Pairs.
|
* fpathconf: (libc)Pathconf.
|
* fpclassify: (libc)Floating Point Classes.
|
* fprintf: (libc)Formatted Output Functions.
|
* fputc: (libc)Simple Output.
|
* fputc_unlocked: (libc)Simple Output.
|
* fputs: (libc)Simple Output.
|
* fputs_unlocked: (libc)Simple Output.
|
* fputwc: (libc)Simple Output.
|
* fputwc_unlocked: (libc)Simple Output.
|
* fputws: (libc)Simple Output.
|
* fputws_unlocked: (libc)Simple Output.
|
* fread: (libc)Block Input/Output.
|
* fread_unlocked: (libc)Block Input/Output.
|
* free: (libc)Freeing after Malloc.
|
* freopen64: (libc)Opening Streams.
|
* freopen: (libc)Opening Streams.
|
* frexp: (libc)Normalization Functions.
|
* frexpf: (libc)Normalization Functions.
|
* frexpl: (libc)Normalization Functions.
|
* fromfp: (libc)Rounding Functions.
|
* fromfpf: (libc)Rounding Functions.
|
* fromfpl: (libc)Rounding Functions.
|
* fromfpx: (libc)Rounding Functions.
|
* fromfpxf: (libc)Rounding Functions.
|
* fromfpxl: (libc)Rounding Functions.
|
* fscanf: (libc)Formatted Input Functions.
|
* fseek: (libc)File Positioning.
|
* fseeko64: (libc)File Positioning.
|
* fseeko: (libc)File Positioning.
|
* fsetpos64: (libc)Portable Positioning.
|
* fsetpos: (libc)Portable Positioning.
|
* fstat64: (libc)Reading Attributes.
|
* fstat: (libc)Reading Attributes.
|
* fsync: (libc)Synchronizing I/O.
|
* ftell: (libc)File Positioning.
|
* ftello64: (libc)File Positioning.
|
* ftello: (libc)File Positioning.
|
* ftruncate64: (libc)File Size.
|
* ftruncate: (libc)File Size.
|
* ftrylockfile: (libc)Streams and Threads.
|
* ftw64: (libc)Working with Directory Trees.
|
* ftw: (libc)Working with Directory Trees.
|
* funlockfile: (libc)Streams and Threads.
|
* futimes: (libc)File Times.
|
* fwide: (libc)Streams and I18N.
|
* fwprintf: (libc)Formatted Output Functions.
|
* fwrite: (libc)Block Input/Output.
|
* fwrite_unlocked: (libc)Block Input/Output.
|
* fwscanf: (libc)Formatted Input Functions.
|
* gamma: (libc)Special Functions.
|
* gammaf: (libc)Special Functions.
|
* gammal: (libc)Special Functions.
|
* gcvt: (libc)System V Number Conversion.
|
* get_avphys_pages: (libc)Query Memory Parameters.
|
* get_current_dir_name: (libc)Working Directory.
|
* get_nprocs: (libc)Processor Resources.
|
* get_nprocs_conf: (libc)Processor Resources.
|
* get_phys_pages: (libc)Query Memory Parameters.
|
* getauxval: (libc)Auxiliary Vector.
|
* getc: (libc)Character Input.
|
* getc_unlocked: (libc)Character Input.
|
* getchar: (libc)Character Input.
|
* getchar_unlocked: (libc)Character Input.
|
* getcontext: (libc)System V contexts.
|
* getcwd: (libc)Working Directory.
|
* getdate: (libc)General Time String Parsing.
|
* getdate_r: (libc)General Time String Parsing.
|
* getdelim: (libc)Line Input.
|
* getdomainname: (libc)Host Identification.
|
* getegid: (libc)Reading Persona.
|
* getentropy: (libc)Unpredictable Bytes.
|
* getenv: (libc)Environment Access.
|
* geteuid: (libc)Reading Persona.
|
* getfsent: (libc)fstab.
|
* getfsfile: (libc)fstab.
|
* getfsspec: (libc)fstab.
|
* getgid: (libc)Reading Persona.
|
* getgrent: (libc)Scanning All Groups.
|
* getgrent_r: (libc)Scanning All Groups.
|
* getgrgid: (libc)Lookup Group.
|
* getgrgid_r: (libc)Lookup Group.
|
* getgrnam: (libc)Lookup Group.
|
* getgrnam_r: (libc)Lookup Group.
|
* getgrouplist: (libc)Setting Groups.
|
* getgroups: (libc)Reading Persona.
|
* gethostbyaddr: (libc)Host Names.
|
* gethostbyaddr_r: (libc)Host Names.
|
* gethostbyname2: (libc)Host Names.
|
* gethostbyname2_r: (libc)Host Names.
|
* gethostbyname: (libc)Host Names.
|
* gethostbyname_r: (libc)Host Names.
|
* gethostent: (libc)Host Names.
|
* gethostid: (libc)Host Identification.
|
* gethostname: (libc)Host Identification.
|
* getitimer: (libc)Setting an Alarm.
|
* getline: (libc)Line Input.
|
* getloadavg: (libc)Processor Resources.
|
* getlogin: (libc)Who Logged In.
|
* getmntent: (libc)mtab.
|
* getmntent_r: (libc)mtab.
|
* getnetbyaddr: (libc)Networks Database.
|
* getnetbyname: (libc)Networks Database.
|
* getnetent: (libc)Networks Database.
|
* getnetgrent: (libc)Lookup Netgroup.
|
* getnetgrent_r: (libc)Lookup Netgroup.
|
* getopt: (libc)Using Getopt.
|
* getopt_long: (libc)Getopt Long Options.
|
* getopt_long_only: (libc)Getopt Long Options.
|
* getpagesize: (libc)Query Memory Parameters.
|
* getpass: (libc)getpass.
|
* getpayload: (libc)FP Bit Twiddling.
|
* getpayloadf: (libc)FP Bit Twiddling.
|
* getpayloadl: (libc)FP Bit Twiddling.
|
* getpeername: (libc)Who is Connected.
|
* getpgid: (libc)Process Group Functions.
|
* getpgrp: (libc)Process Group Functions.
|
* getpid: (libc)Process Identification.
|
* getppid: (libc)Process Identification.
|
* getpriority: (libc)Traditional Scheduling Functions.
|
* getprotobyname: (libc)Protocols Database.
|
* getprotobynumber: (libc)Protocols Database.
|
* getprotoent: (libc)Protocols Database.
|
* getpt: (libc)Allocation.
|
* getpwent: (libc)Scanning All Users.
|
* getpwent_r: (libc)Scanning All Users.
|
* getpwnam: (libc)Lookup User.
|
* getpwnam_r: (libc)Lookup User.
|
* getpwuid: (libc)Lookup User.
|
* getpwuid_r: (libc)Lookup User.
|
* getrandom: (libc)Unpredictable Bytes.
|
* getrlimit64: (libc)Limits on Resources.
|
* getrlimit: (libc)Limits on Resources.
|
* getrusage: (libc)Resource Usage.
|
* gets: (libc)Line Input.
|
* getservbyname: (libc)Services Database.
|
* getservbyport: (libc)Services Database.
|
* getservent: (libc)Services Database.
|
* getsid: (libc)Process Group Functions.
|
* getsockname: (libc)Reading Address.
|
* getsockopt: (libc)Socket Option Functions.
|
* getsubopt: (libc)Suboptions.
|
* gettext: (libc)Translation with gettext.
|
* gettimeofday: (libc)High-Resolution Calendar.
|
* getuid: (libc)Reading Persona.
|
* getumask: (libc)Setting Permissions.
|
* getutent: (libc)Manipulating the Database.
|
* getutent_r: (libc)Manipulating the Database.
|
* getutid: (libc)Manipulating the Database.
|
* getutid_r: (libc)Manipulating the Database.
|
* getutline: (libc)Manipulating the Database.
|
* getutline_r: (libc)Manipulating the Database.
|
* getutmp: (libc)XPG Functions.
|
* getutmpx: (libc)XPG Functions.
|
* getutxent: (libc)XPG Functions.
|
* getutxid: (libc)XPG Functions.
|
* getutxline: (libc)XPG Functions.
|
* getw: (libc)Character Input.
|
* getwc: (libc)Character Input.
|
* getwc_unlocked: (libc)Character Input.
|
* getwchar: (libc)Character Input.
|
* getwchar_unlocked: (libc)Character Input.
|
* getwd: (libc)Working Directory.
|
* glob64: (libc)Calling Glob.
|
* glob: (libc)Calling Glob.
|
* globfree64: (libc)More Flags for Globbing.
|
* globfree: (libc)More Flags for Globbing.
|
* gmtime: (libc)Broken-down Time.
|
* gmtime_r: (libc)Broken-down Time.
|
* grantpt: (libc)Allocation.
|
* gsignal: (libc)Signaling Yourself.
|
* gtty: (libc)BSD Terminal Modes.
|
* hasmntopt: (libc)mtab.
|
* hcreate: (libc)Hash Search Function.
|
* hcreate_r: (libc)Hash Search Function.
|
* hdestroy: (libc)Hash Search Function.
|
* hdestroy_r: (libc)Hash Search Function.
|
* hsearch: (libc)Hash Search Function.
|
* hsearch_r: (libc)Hash Search Function.
|
* htonl: (libc)Byte Order.
|
* htons: (libc)Byte Order.
|
* hypot: (libc)Exponents and Logarithms.
|
* hypotf: (libc)Exponents and Logarithms.
|
* hypotl: (libc)Exponents and Logarithms.
|
* iconv: (libc)Generic Conversion Interface.
|
* iconv_close: (libc)Generic Conversion Interface.
|
* iconv_open: (libc)Generic Conversion Interface.
|
* if_freenameindex: (libc)Interface Naming.
|
* if_indextoname: (libc)Interface Naming.
|
* if_nameindex: (libc)Interface Naming.
|
* if_nametoindex: (libc)Interface Naming.
|
* ilogb: (libc)Exponents and Logarithms.
|
* ilogbf: (libc)Exponents and Logarithms.
|
* ilogbl: (libc)Exponents and Logarithms.
|
* imaxabs: (libc)Absolute Value.
|
* imaxdiv: (libc)Integer Division.
|
* in6addr_any: (libc)Host Address Data Type.
|
* in6addr_loopback: (libc)Host Address Data Type.
|
* index: (libc)Search Functions.
|
* inet_addr: (libc)Host Address Functions.
|
* inet_aton: (libc)Host Address Functions.
|
* inet_lnaof: (libc)Host Address Functions.
|
* inet_makeaddr: (libc)Host Address Functions.
|
* inet_netof: (libc)Host Address Functions.
|
* inet_network: (libc)Host Address Functions.
|
* inet_ntoa: (libc)Host Address Functions.
|
* inet_ntop: (libc)Host Address Functions.
|
* inet_pton: (libc)Host Address Functions.
|
* initgroups: (libc)Setting Groups.
|
* initstate: (libc)BSD Random.
|
* initstate_r: (libc)BSD Random.
|
* innetgr: (libc)Netgroup Membership.
|
* ioctl: (libc)IOCTLs.
|
* isalnum: (libc)Classification of Characters.
|
* isalpha: (libc)Classification of Characters.
|
* isascii: (libc)Classification of Characters.
|
* isatty: (libc)Is It a Terminal.
|
* isblank: (libc)Classification of Characters.
|
* iscanonical: (libc)Floating Point Classes.
|
* iscntrl: (libc)Classification of Characters.
|
* isdigit: (libc)Classification of Characters.
|
* iseqsig: (libc)FP Comparison Functions.
|
* isfinite: (libc)Floating Point Classes.
|
* isgraph: (libc)Classification of Characters.
|
* isgreater: (libc)FP Comparison Functions.
|
* isgreaterequal: (libc)FP Comparison Functions.
|
* isinf: (libc)Floating Point Classes.
|
* isinff: (libc)Floating Point Classes.
|
* isinfl: (libc)Floating Point Classes.
|
* isless: (libc)FP Comparison Functions.
|
* islessequal: (libc)FP Comparison Functions.
|
* islessgreater: (libc)FP Comparison Functions.
|
* islower: (libc)Classification of Characters.
|
* isnan: (libc)Floating Point Classes.
|
* isnanf: (libc)Floating Point Classes.
|
* isnanl: (libc)Floating Point Classes.
|
* isnormal: (libc)Floating Point Classes.
|
* isprint: (libc)Classification of Characters.
|
* ispunct: (libc)Classification of Characters.
|
* issignaling: (libc)Floating Point Classes.
|
* isspace: (libc)Classification of Characters.
|
* issubnormal: (libc)Floating Point Classes.
|
* isunordered: (libc)FP Comparison Functions.
|
* isupper: (libc)Classification of Characters.
|
* iswalnum: (libc)Classification of Wide Characters.
|
* iswalpha: (libc)Classification of Wide Characters.
|
* iswblank: (libc)Classification of Wide Characters.
|
* iswcntrl: (libc)Classification of Wide Characters.
|
* iswctype: (libc)Classification of Wide Characters.
|
* iswdigit: (libc)Classification of Wide Characters.
|
* iswgraph: (libc)Classification of Wide Characters.
|
* iswlower: (libc)Classification of Wide Characters.
|
* iswprint: (libc)Classification of Wide Characters.
|
* iswpunct: (libc)Classification of Wide Characters.
|
* iswspace: (libc)Classification of Wide Characters.
|
* iswupper: (libc)Classification of Wide Characters.
|
* iswxdigit: (libc)Classification of Wide Characters.
|
* isxdigit: (libc)Classification of Characters.
|
* iszero: (libc)Floating Point Classes.
|
* j0: (libc)Special Functions.
|
* j0f: (libc)Special Functions.
|
* j0l: (libc)Special Functions.
|
* j1: (libc)Special Functions.
|
* j1f: (libc)Special Functions.
|
* j1l: (libc)Special Functions.
|
* jn: (libc)Special Functions.
|
* jnf: (libc)Special Functions.
|
* jnl: (libc)Special Functions.
|
* jrand48: (libc)SVID Random.
|
* jrand48_r: (libc)SVID Random.
|
* kill: (libc)Signaling Another Process.
|
* killpg: (libc)Signaling Another Process.
|
* l64a: (libc)Encode Binary Data.
|
* labs: (libc)Absolute Value.
|
* lcong48: (libc)SVID Random.
|
* lcong48_r: (libc)SVID Random.
|
* ldexp: (libc)Normalization Functions.
|
* ldexpf: (libc)Normalization Functions.
|
* ldexpl: (libc)Normalization Functions.
|
* ldiv: (libc)Integer Division.
|
* lfind: (libc)Array Search Function.
|
* lgamma: (libc)Special Functions.
|
* lgamma_r: (libc)Special Functions.
|
* lgammaf: (libc)Special Functions.
|
* lgammaf_r: (libc)Special Functions.
|
* lgammal: (libc)Special Functions.
|
* lgammal_r: (libc)Special Functions.
|
* link: (libc)Hard Links.
|
* lio_listio64: (libc)Asynchronous Reads/Writes.
|
* lio_listio: (libc)Asynchronous Reads/Writes.
|
* listen: (libc)Listening.
|
* llabs: (libc)Absolute Value.
|
* lldiv: (libc)Integer Division.
|
* llogb: (libc)Exponents and Logarithms.
|
* llogbf: (libc)Exponents and Logarithms.
|
* llogbl: (libc)Exponents and Logarithms.
|
* llrint: (libc)Rounding Functions.
|
* llrintf: (libc)Rounding Functions.
|
* llrintl: (libc)Rounding Functions.
|
* llround: (libc)Rounding Functions.
|
* llroundf: (libc)Rounding Functions.
|
* llroundl: (libc)Rounding Functions.
|
* localeconv: (libc)The Lame Way to Locale Data.
|
* localtime: (libc)Broken-down Time.
|
* localtime_r: (libc)Broken-down Time.
|
* log10: (libc)Exponents and Logarithms.
|
* log10f: (libc)Exponents and Logarithms.
|
* log10l: (libc)Exponents and Logarithms.
|
* log1p: (libc)Exponents and Logarithms.
|
* log1pf: (libc)Exponents and Logarithms.
|
* log1pl: (libc)Exponents and Logarithms.
|
* log2: (libc)Exponents and Logarithms.
|
* log2f: (libc)Exponents and Logarithms.
|
* log2l: (libc)Exponents and Logarithms.
|
* log: (libc)Exponents and Logarithms.
|
* logb: (libc)Exponents and Logarithms.
|
* logbf: (libc)Exponents and Logarithms.
|
* logbl: (libc)Exponents and Logarithms.
|
* logf: (libc)Exponents and Logarithms.
|
* login: (libc)Logging In and Out.
|
* login_tty: (libc)Logging In and Out.
|
* logl: (libc)Exponents and Logarithms.
|
* logout: (libc)Logging In and Out.
|
* logwtmp: (libc)Logging In and Out.
|
* longjmp: (libc)Non-Local Details.
|
* lrand48: (libc)SVID Random.
|
* lrand48_r: (libc)SVID Random.
|
* lrint: (libc)Rounding Functions.
|
* lrintf: (libc)Rounding Functions.
|
* lrintl: (libc)Rounding Functions.
|
* lround: (libc)Rounding Functions.
|
* lroundf: (libc)Rounding Functions.
|
* lroundl: (libc)Rounding Functions.
|
* lsearch: (libc)Array Search Function.
|
* lseek64: (libc)File Position Primitive.
|
* lseek: (libc)File Position Primitive.
|
* lstat64: (libc)Reading Attributes.
|
* lstat: (libc)Reading Attributes.
|
* lutimes: (libc)File Times.
|
* madvise: (libc)Memory-mapped I/O.
|
* makecontext: (libc)System V contexts.
|
* mallinfo: (libc)Statistics of Malloc.
|
* malloc: (libc)Basic Allocation.
|
* mallopt: (libc)Malloc Tunable Parameters.
|
* mblen: (libc)Non-reentrant Character Conversion.
|
* mbrlen: (libc)Converting a Character.
|
* mbrtowc: (libc)Converting a Character.
|
* mbsinit: (libc)Keeping the state.
|
* mbsnrtowcs: (libc)Converting Strings.
|
* mbsrtowcs: (libc)Converting Strings.
|
* mbstowcs: (libc)Non-reentrant String Conversion.
|
* mbtowc: (libc)Non-reentrant Character Conversion.
|
* mcheck: (libc)Heap Consistency Checking.
|
* memalign: (libc)Aligned Memory Blocks.
|
* memccpy: (libc)Copying Strings and Arrays.
|
* memchr: (libc)Search Functions.
|
* memcmp: (libc)String/Array Comparison.
|
* memcpy: (libc)Copying Strings and Arrays.
|
* memfrob: (libc)Trivial Encryption.
|
* memmem: (libc)Search Functions.
|
* memmove: (libc)Copying Strings and Arrays.
|
* mempcpy: (libc)Copying Strings and Arrays.
|
* memrchr: (libc)Search Functions.
|
* memset: (libc)Copying Strings and Arrays.
|
* mkdir: (libc)Creating Directories.
|
* mkdtemp: (libc)Temporary Files.
|
* mkfifo: (libc)FIFO Special Files.
|
* mknod: (libc)Making Special Files.
|
* mkstemp: (libc)Temporary Files.
|
* mktemp: (libc)Temporary Files.
|
* mktime: (libc)Broken-down Time.
|
* mlock: (libc)Page Lock Functions.
|
* mlockall: (libc)Page Lock Functions.
|
* mmap64: (libc)Memory-mapped I/O.
|
* mmap: (libc)Memory-mapped I/O.
|
* modf: (libc)Rounding Functions.
|
* modff: (libc)Rounding Functions.
|
* modfl: (libc)Rounding Functions.
|
* mount: (libc)Mount-Unmount-Remount.
|
* mprobe: (libc)Heap Consistency Checking.
|
* mrand48: (libc)SVID Random.
|
* mrand48_r: (libc)SVID Random.
|
* mremap: (libc)Memory-mapped I/O.
|
* msync: (libc)Memory-mapped I/O.
|
* mtrace: (libc)Tracing malloc.
|
* munlock: (libc)Page Lock Functions.
|
* munlockall: (libc)Page Lock Functions.
|
* munmap: (libc)Memory-mapped I/O.
|
* muntrace: (libc)Tracing malloc.
|
* nan: (libc)FP Bit Twiddling.
|
* nanf: (libc)FP Bit Twiddling.
|
* nanl: (libc)FP Bit Twiddling.
|
* nanosleep: (libc)Sleeping.
|
* nearbyint: (libc)Rounding Functions.
|
* nearbyintf: (libc)Rounding Functions.
|
* nearbyintl: (libc)Rounding Functions.
|
* nextafter: (libc)FP Bit Twiddling.
|
* nextafterf: (libc)FP Bit Twiddling.
|
* nextafterl: (libc)FP Bit Twiddling.
|
* nextdown: (libc)FP Bit Twiddling.
|
* nextdownf: (libc)FP Bit Twiddling.
|
* nextdownl: (libc)FP Bit Twiddling.
|
* nexttoward: (libc)FP Bit Twiddling.
|
* nexttowardf: (libc)FP Bit Twiddling.
|
* nexttowardl: (libc)FP Bit Twiddling.
|
* nextup: (libc)FP Bit Twiddling.
|
* nextupf: (libc)FP Bit Twiddling.
|
* nextupl: (libc)FP Bit Twiddling.
|
* nftw64: (libc)Working with Directory Trees.
|
* nftw: (libc)Working with Directory Trees.
|
* ngettext: (libc)Advanced gettext functions.
|
* nice: (libc)Traditional Scheduling Functions.
|
* nl_langinfo: (libc)The Elegant and Fast Way.
|
* nrand48: (libc)SVID Random.
|
* nrand48_r: (libc)SVID Random.
|
* ntohl: (libc)Byte Order.
|
* ntohs: (libc)Byte Order.
|
* ntp_adjtime: (libc)High Accuracy Clock.
|
* ntp_gettime: (libc)High Accuracy Clock.
|
* obstack_1grow: (libc)Growing Objects.
|
* obstack_1grow_fast: (libc)Extra Fast Growing.
|
* obstack_alignment_mask: (libc)Obstacks Data Alignment.
|
* obstack_alloc: (libc)Allocation in an Obstack.
|
* obstack_base: (libc)Status of an Obstack.
|
* obstack_blank: (libc)Growing Objects.
|
* obstack_blank_fast: (libc)Extra Fast Growing.
|
* obstack_chunk_size: (libc)Obstack Chunks.
|
* obstack_copy0: (libc)Allocation in an Obstack.
|
* obstack_copy: (libc)Allocation in an Obstack.
|
* obstack_finish: (libc)Growing Objects.
|
* obstack_free: (libc)Freeing Obstack Objects.
|
* obstack_grow0: (libc)Growing Objects.
|
* obstack_grow: (libc)Growing Objects.
|
* obstack_init: (libc)Preparing for Obstacks.
|
* obstack_int_grow: (libc)Growing Objects.
|
* obstack_int_grow_fast: (libc)Extra Fast Growing.
|
* obstack_next_free: (libc)Status of an Obstack.
|
* obstack_object_size: (libc)Growing Objects.
|
* obstack_object_size: (libc)Status of an Obstack.
|
* obstack_printf: (libc)Dynamic Output.
|
* obstack_ptr_grow: (libc)Growing Objects.
|
* obstack_ptr_grow_fast: (libc)Extra Fast Growing.
|
* obstack_room: (libc)Extra Fast Growing.
|
* obstack_vprintf: (libc)Variable Arguments Output.
|
* offsetof: (libc)Structure Measurement.
|
* on_exit: (libc)Cleanups on Exit.
|
* open64: (libc)Opening and Closing Files.
|
* open: (libc)Opening and Closing Files.
|
* open_memstream: (libc)String Streams.
|
* opendir: (libc)Opening a Directory.
|
* openlog: (libc)openlog.
|
* openpty: (libc)Pseudo-Terminal Pairs.
|
* parse_printf_format: (libc)Parsing a Template String.
|
* pathconf: (libc)Pathconf.
|
* pause: (libc)Using Pause.
|
* pclose: (libc)Pipe to a Subprocess.
|
* perror: (libc)Error Messages.
|
* pipe: (libc)Creating a Pipe.
|
* popen: (libc)Pipe to a Subprocess.
|
* posix_fallocate64: (libc)Storage Allocation.
|
* posix_fallocate: (libc)Storage Allocation.
|
* posix_memalign: (libc)Aligned Memory Blocks.
|
* pow10: (libc)Exponents and Logarithms.
|
* pow10f: (libc)Exponents and Logarithms.
|
* pow10l: (libc)Exponents and Logarithms.
|
* pow: (libc)Exponents and Logarithms.
|
* powf: (libc)Exponents and Logarithms.
|
* powl: (libc)Exponents and Logarithms.
|
* pread64: (libc)I/O Primitives.
|
* pread: (libc)I/O Primitives.
|
* printf: (libc)Formatted Output Functions.
|
* printf_size: (libc)Predefined Printf Handlers.
|
* printf_size_info: (libc)Predefined Printf Handlers.
|
* psignal: (libc)Signal Messages.
|
* pthread_getattr_default_np: (libc)Default Thread Attributes.
|
* pthread_getspecific: (libc)Thread-specific Data.
|
* pthread_key_create: (libc)Thread-specific Data.
|
* pthread_key_delete: (libc)Thread-specific Data.
|
* pthread_setattr_default_np: (libc)Default Thread Attributes.
|
* pthread_setspecific: (libc)Thread-specific Data.
|
* ptsname: (libc)Allocation.
|
* ptsname_r: (libc)Allocation.
|
* putc: (libc)Simple Output.
|
* putc_unlocked: (libc)Simple Output.
|
* putchar: (libc)Simple Output.
|
* putchar_unlocked: (libc)Simple Output.
|
* putenv: (libc)Environment Access.
|
* putpwent: (libc)Writing a User Entry.
|
* puts: (libc)Simple Output.
|
* pututline: (libc)Manipulating the Database.
|
* pututxline: (libc)XPG Functions.
|
* putw: (libc)Simple Output.
|
* putwc: (libc)Simple Output.
|
* putwc_unlocked: (libc)Simple Output.
|
* putwchar: (libc)Simple Output.
|
* putwchar_unlocked: (libc)Simple Output.
|
* pwrite64: (libc)I/O Primitives.
|
* pwrite: (libc)I/O Primitives.
|
* qecvt: (libc)System V Number Conversion.
|
* qecvt_r: (libc)System V Number Conversion.
|
* qfcvt: (libc)System V Number Conversion.
|
* qfcvt_r: (libc)System V Number Conversion.
|
* qgcvt: (libc)System V Number Conversion.
|
* qsort: (libc)Array Sort Function.
|
* raise: (libc)Signaling Yourself.
|
* rand: (libc)ISO Random.
|
* rand_r: (libc)ISO Random.
|
* random: (libc)BSD Random.
|
* random_r: (libc)BSD Random.
|
* rawmemchr: (libc)Search Functions.
|
* read: (libc)I/O Primitives.
|
* readdir64: (libc)Reading/Closing Directory.
|
* readdir64_r: (libc)Reading/Closing Directory.
|
* readdir: (libc)Reading/Closing Directory.
|
* readdir_r: (libc)Reading/Closing Directory.
|
* readlink: (libc)Symbolic Links.
|
* readv: (libc)Scatter-Gather.
|
* realloc: (libc)Changing Block Size.
|
* realpath: (libc)Symbolic Links.
|
* recv: (libc)Receiving Data.
|
* recvfrom: (libc)Receiving Datagrams.
|
* recvmsg: (libc)Receiving Datagrams.
|
* regcomp: (libc)POSIX Regexp Compilation.
|
* regerror: (libc)Regexp Cleanup.
|
* regexec: (libc)Matching POSIX Regexps.
|
* regfree: (libc)Regexp Cleanup.
|
* register_printf_function: (libc)Registering New Conversions.
|
* remainder: (libc)Remainder Functions.
|
* remainderf: (libc)Remainder Functions.
|
* remainderl: (libc)Remainder Functions.
|
* remove: (libc)Deleting Files.
|
* rename: (libc)Renaming Files.
|
* rewind: (libc)File Positioning.
|
* rewinddir: (libc)Random Access Directory.
|
* rindex: (libc)Search Functions.
|
* rint: (libc)Rounding Functions.
|
* rintf: (libc)Rounding Functions.
|
* rintl: (libc)Rounding Functions.
|
* rmdir: (libc)Deleting Files.
|
* round: (libc)Rounding Functions.
|
* roundeven: (libc)Rounding Functions.
|
* roundevenf: (libc)Rounding Functions.
|
* roundevenl: (libc)Rounding Functions.
|
* roundf: (libc)Rounding Functions.
|
* roundl: (libc)Rounding Functions.
|
* rpmatch: (libc)Yes-or-No Questions.
|
* sbrk: (libc)Resizing the Data Segment.
|
* scalb: (libc)Normalization Functions.
|
* scalbf: (libc)Normalization Functions.
|
* scalbl: (libc)Normalization Functions.
|
* scalbln: (libc)Normalization Functions.
|
* scalblnf: (libc)Normalization Functions.
|
* scalblnl: (libc)Normalization Functions.
|
* scalbn: (libc)Normalization Functions.
|
* scalbnf: (libc)Normalization Functions.
|
* scalbnl: (libc)Normalization Functions.
|
* scandir64: (libc)Scanning Directory Content.
|
* scandir: (libc)Scanning Directory Content.
|
* scanf: (libc)Formatted Input Functions.
|
* sched_get_priority_max: (libc)Basic Scheduling Functions.
|
* sched_get_priority_min: (libc)Basic Scheduling Functions.
|
* sched_getaffinity: (libc)CPU Affinity.
|
* sched_getparam: (libc)Basic Scheduling Functions.
|
* sched_getscheduler: (libc)Basic Scheduling Functions.
|
* sched_rr_get_interval: (libc)Basic Scheduling Functions.
|
* sched_setaffinity: (libc)CPU Affinity.
|
* sched_setparam: (libc)Basic Scheduling Functions.
|
* sched_setscheduler: (libc)Basic Scheduling Functions.
|
* sched_yield: (libc)Basic Scheduling Functions.
|
* secure_getenv: (libc)Environment Access.
|
* seed48: (libc)SVID Random.
|
* seed48_r: (libc)SVID Random.
|
* seekdir: (libc)Random Access Directory.
|
* select: (libc)Waiting for I/O.
|
* sem_close: (libc)Semaphores.
|
* sem_destroy: (libc)Semaphores.
|
* sem_getvalue: (libc)Semaphores.
|
* sem_init: (libc)Semaphores.
|
* sem_open: (libc)Semaphores.
|
* sem_post: (libc)Semaphores.
|
* sem_timedwait: (libc)Semaphores.
|
* sem_trywait: (libc)Semaphores.
|
* sem_unlink: (libc)Semaphores.
|
* sem_wait: (libc)Semaphores.
|
* semctl: (libc)Semaphores.
|
* semget: (libc)Semaphores.
|
* semop: (libc)Semaphores.
|
* semtimedop: (libc)Semaphores.
|
* send: (libc)Sending Data.
|
* sendmsg: (libc)Receiving Datagrams.
|
* sendto: (libc)Sending Datagrams.
|
* setbuf: (libc)Controlling Buffering.
|
* setbuffer: (libc)Controlling Buffering.
|
* setcontext: (libc)System V contexts.
|
* setdomainname: (libc)Host Identification.
|
* setegid: (libc)Setting Groups.
|
* setenv: (libc)Environment Access.
|
* seteuid: (libc)Setting User ID.
|
* setfsent: (libc)fstab.
|
* setgid: (libc)Setting Groups.
|
* setgrent: (libc)Scanning All Groups.
|
* setgroups: (libc)Setting Groups.
|
* sethostent: (libc)Host Names.
|
* sethostid: (libc)Host Identification.
|
* sethostname: (libc)Host Identification.
|
* setitimer: (libc)Setting an Alarm.
|
* setjmp: (libc)Non-Local Details.
|
* setkey: (libc)DES Encryption.
|
* setkey_r: (libc)DES Encryption.
|
* setlinebuf: (libc)Controlling Buffering.
|
* setlocale: (libc)Setting the Locale.
|
* setlogmask: (libc)setlogmask.
|
* setmntent: (libc)mtab.
|
* setnetent: (libc)Networks Database.
|
* setnetgrent: (libc)Lookup Netgroup.
|
* setpayload: (libc)FP Bit Twiddling.
|
* setpayloadf: (libc)FP Bit Twiddling.
|
* setpayloadl: (libc)FP Bit Twiddling.
|
* setpayloadsig: (libc)FP Bit Twiddling.
|
* setpayloadsigf: (libc)FP Bit Twiddling.
|
* setpayloadsigl: (libc)FP Bit Twiddling.
|
* setpgid: (libc)Process Group Functions.
|
* setpgrp: (libc)Process Group Functions.
|
* setpriority: (libc)Traditional Scheduling Functions.
|
* setprotoent: (libc)Protocols Database.
|
* setpwent: (libc)Scanning All Users.
|
* setregid: (libc)Setting Groups.
|
* setreuid: (libc)Setting User ID.
|
* setrlimit64: (libc)Limits on Resources.
|
* setrlimit: (libc)Limits on Resources.
|
* setservent: (libc)Services Database.
|
* setsid: (libc)Process Group Functions.
|
* setsockopt: (libc)Socket Option Functions.
|
* setstate: (libc)BSD Random.
|
* setstate_r: (libc)BSD Random.
|
* settimeofday: (libc)High-Resolution Calendar.
|
* setuid: (libc)Setting User ID.
|
* setutent: (libc)Manipulating the Database.
|
* setutxent: (libc)XPG Functions.
|
* setvbuf: (libc)Controlling Buffering.
|
* shm_open: (libc)Memory-mapped I/O.
|
* shm_unlink: (libc)Memory-mapped I/O.
|
* shutdown: (libc)Closing a Socket.
|
* sigaction: (libc)Advanced Signal Handling.
|
* sigaddset: (libc)Signal Sets.
|
* sigaltstack: (libc)Signal Stack.
|
* sigblock: (libc)BSD Signal Handling.
|
* sigdelset: (libc)Signal Sets.
|
* sigemptyset: (libc)Signal Sets.
|
* sigfillset: (libc)Signal Sets.
|
* siginterrupt: (libc)BSD Signal Handling.
|
* sigismember: (libc)Signal Sets.
|
* siglongjmp: (libc)Non-Local Exits and Signals.
|
* sigmask: (libc)BSD Signal Handling.
|
* signal: (libc)Basic Signal Handling.
|
* signbit: (libc)FP Bit Twiddling.
|
* significand: (libc)Normalization Functions.
|
* significandf: (libc)Normalization Functions.
|
* significandl: (libc)Normalization Functions.
|
* sigpause: (libc)BSD Signal Handling.
|
* sigpending: (libc)Checking for Pending Signals.
|
* sigprocmask: (libc)Process Signal Mask.
|
* sigsetjmp: (libc)Non-Local Exits and Signals.
|
* sigsetmask: (libc)BSD Signal Handling.
|
* sigstack: (libc)Signal Stack.
|
* sigsuspend: (libc)Sigsuspend.
|
* sin: (libc)Trig Functions.
|
* sincos: (libc)Trig Functions.
|
* sincosf: (libc)Trig Functions.
|
* sincosl: (libc)Trig Functions.
|
* sinf: (libc)Trig Functions.
|
* sinh: (libc)Hyperbolic Functions.
|
* sinhf: (libc)Hyperbolic Functions.
|
* sinhl: (libc)Hyperbolic Functions.
|
* sinl: (libc)Trig Functions.
|
* sleep: (libc)Sleeping.
|
* snprintf: (libc)Formatted Output Functions.
|
* socket: (libc)Creating a Socket.
|
* socketpair: (libc)Socket Pairs.
|
* sprintf: (libc)Formatted Output Functions.
|
* sqrt: (libc)Exponents and Logarithms.
|
* sqrtf: (libc)Exponents and Logarithms.
|
* sqrtl: (libc)Exponents and Logarithms.
|
* srand48: (libc)SVID Random.
|
* srand48_r: (libc)SVID Random.
|
* srand: (libc)ISO Random.
|
* srandom: (libc)BSD Random.
|
* srandom_r: (libc)BSD Random.
|
* sscanf: (libc)Formatted Input Functions.
|
* ssignal: (libc)Basic Signal Handling.
|
* stat64: (libc)Reading Attributes.
|
* stat: (libc)Reading Attributes.
|
* stime: (libc)Simple Calendar Time.
|
* stpcpy: (libc)Copying Strings and Arrays.
|
* stpncpy: (libc)Truncating Strings.
|
* strcasecmp: (libc)String/Array Comparison.
|
* strcasestr: (libc)Search Functions.
|
* strcat: (libc)Concatenating Strings.
|
* strchr: (libc)Search Functions.
|
* strchrnul: (libc)Search Functions.
|
* strcmp: (libc)String/Array Comparison.
|
* strcoll: (libc)Collation Functions.
|
* strcpy: (libc)Copying Strings and Arrays.
|
* strcspn: (libc)Search Functions.
|
* strdup: (libc)Copying Strings and Arrays.
|
* strdupa: (libc)Copying Strings and Arrays.
|
* strerror: (libc)Error Messages.
|
* strerror_r: (libc)Error Messages.
|
* strfmon: (libc)Formatting Numbers.
|
* strfromd: (libc)Printing of Floats.
|
* strfromf: (libc)Printing of Floats.
|
* strfroml: (libc)Printing of Floats.
|
* strfry: (libc)strfry.
|
* strftime: (libc)Formatting Calendar Time.
|
* strlen: (libc)String Length.
|
* strncasecmp: (libc)String/Array Comparison.
|
* strncat: (libc)Truncating Strings.
|
* strncmp: (libc)String/Array Comparison.
|
* strncpy: (libc)Truncating Strings.
|
* strndup: (libc)Truncating Strings.
|
* strndupa: (libc)Truncating Strings.
|
* strnlen: (libc)String Length.
|
* strpbrk: (libc)Search Functions.
|
* strptime: (libc)Low-Level Time String Parsing.
|
* strrchr: (libc)Search Functions.
|
* strsep: (libc)Finding Tokens in a String.
|
* strsignal: (libc)Signal Messages.
|
* strspn: (libc)Search Functions.
|
* strstr: (libc)Search Functions.
|
* strtod: (libc)Parsing of Floats.
|
* strtof: (libc)Parsing of Floats.
|
* strtoimax: (libc)Parsing of Integers.
|
* strtok: (libc)Finding Tokens in a String.
|
* strtok_r: (libc)Finding Tokens in a String.
|
* strtol: (libc)Parsing of Integers.
|
* strtold: (libc)Parsing of Floats.
|
* strtoll: (libc)Parsing of Integers.
|
* strtoq: (libc)Parsing of Integers.
|
* strtoul: (libc)Parsing of Integers.
|
* strtoull: (libc)Parsing of Integers.
|
* strtoumax: (libc)Parsing of Integers.
|
* strtouq: (libc)Parsing of Integers.
|
* strverscmp: (libc)String/Array Comparison.
|
* strxfrm: (libc)Collation Functions.
|
* stty: (libc)BSD Terminal Modes.
|
* swapcontext: (libc)System V contexts.
|
* swprintf: (libc)Formatted Output Functions.
|
* swscanf: (libc)Formatted Input Functions.
|
* symlink: (libc)Symbolic Links.
|
* sync: (libc)Synchronizing I/O.
|
* syscall: (libc)System Calls.
|
* sysconf: (libc)Sysconf Definition.
|
* sysctl: (libc)System Parameters.
|
* syslog: (libc)syslog; vsyslog.
|
* system: (libc)Running a Command.
|
* sysv_signal: (libc)Basic Signal Handling.
|
* tan: (libc)Trig Functions.
|
* tanf: (libc)Trig Functions.
|
* tanh: (libc)Hyperbolic Functions.
|
* tanhf: (libc)Hyperbolic Functions.
|
* tanhl: (libc)Hyperbolic Functions.
|
* tanl: (libc)Trig Functions.
|
* tcdrain: (libc)Line Control.
|
* tcflow: (libc)Line Control.
|
* tcflush: (libc)Line Control.
|
* tcgetattr: (libc)Mode Functions.
|
* tcgetpgrp: (libc)Terminal Access Functions.
|
* tcgetsid: (libc)Terminal Access Functions.
|
* tcsendbreak: (libc)Line Control.
|
* tcsetattr: (libc)Mode Functions.
|
* tcsetpgrp: (libc)Terminal Access Functions.
|
* tdelete: (libc)Tree Search Function.
|
* tdestroy: (libc)Tree Search Function.
|
* telldir: (libc)Random Access Directory.
|
* tempnam: (libc)Temporary Files.
|
* textdomain: (libc)Locating gettext catalog.
|
* tfind: (libc)Tree Search Function.
|
* tgamma: (libc)Special Functions.
|
* tgammaf: (libc)Special Functions.
|
* tgammal: (libc)Special Functions.
|
* time: (libc)Simple Calendar Time.
|
* timegm: (libc)Broken-down Time.
|
* timelocal: (libc)Broken-down Time.
|
* times: (libc)Processor Time.
|
* tmpfile64: (libc)Temporary Files.
|
* tmpfile: (libc)Temporary Files.
|
* tmpnam: (libc)Temporary Files.
|
* tmpnam_r: (libc)Temporary Files.
|
* toascii: (libc)Case Conversion.
|
* tolower: (libc)Case Conversion.
|
* totalorder: (libc)FP Comparison Functions.
|
* totalorderf: (libc)FP Comparison Functions.
|
* totalorderl: (libc)FP Comparison Functions.
|
* totalordermag: (libc)FP Comparison Functions.
|
* totalordermagf: (libc)FP Comparison Functions.
|
* totalordermagl: (libc)FP Comparison Functions.
|
* toupper: (libc)Case Conversion.
|
* towctrans: (libc)Wide Character Case Conversion.
|
* towlower: (libc)Wide Character Case Conversion.
|
* towupper: (libc)Wide Character Case Conversion.
|
* trunc: (libc)Rounding Functions.
|
* truncate64: (libc)File Size.
|
* truncate: (libc)File Size.
|
* truncf: (libc)Rounding Functions.
|
* truncl: (libc)Rounding Functions.
|
* tsearch: (libc)Tree Search Function.
|
* ttyname: (libc)Is It a Terminal.
|
* ttyname_r: (libc)Is It a Terminal.
|
* twalk: (libc)Tree Search Function.
|
* tzset: (libc)Time Zone Functions.
|
* ufromfp: (libc)Rounding Functions.
|
* ufromfpf: (libc)Rounding Functions.
|
* ufromfpl: (libc)Rounding Functions.
|
* ufromfpx: (libc)Rounding Functions.
|
* ufromfpxf: (libc)Rounding Functions.
|
* ufromfpxl: (libc)Rounding Functions.
|
* ulimit: (libc)Limits on Resources.
|
* umask: (libc)Setting Permissions.
|
* umount2: (libc)Mount-Unmount-Remount.
|
* umount: (libc)Mount-Unmount-Remount.
|
* uname: (libc)Platform Type.
|
* ungetc: (libc)How Unread.
|
* ungetwc: (libc)How Unread.
|
* unlink: (libc)Deleting Files.
|
* unlockpt: (libc)Allocation.
|
* unsetenv: (libc)Environment Access.
|
* updwtmp: (libc)Manipulating the Database.
|
* utime: (libc)File Times.
|
* utimes: (libc)File Times.
|
* utmpname: (libc)Manipulating the Database.
|
* utmpxname: (libc)XPG Functions.
|
* va_arg: (libc)Argument Macros.
|
* va_copy: (libc)Argument Macros.
|
* va_end: (libc)Argument Macros.
|
* va_start: (libc)Argument Macros.
|
* valloc: (libc)Aligned Memory Blocks.
|
* vasprintf: (libc)Variable Arguments Output.
|
* verr: (libc)Error Messages.
|
* verrx: (libc)Error Messages.
|
* versionsort64: (libc)Scanning Directory Content.
|
* versionsort: (libc)Scanning Directory Content.
|
* vfork: (libc)Creating a Process.
|
* vfprintf: (libc)Variable Arguments Output.
|
* vfscanf: (libc)Variable Arguments Input.
|
* vfwprintf: (libc)Variable Arguments Output.
|
* vfwscanf: (libc)Variable Arguments Input.
|
* vlimit: (libc)Limits on Resources.
|
* vprintf: (libc)Variable Arguments Output.
|
* vscanf: (libc)Variable Arguments Input.
|
* vsnprintf: (libc)Variable Arguments Output.
|
* vsprintf: (libc)Variable Arguments Output.
|
* vsscanf: (libc)Variable Arguments Input.
|
* vswprintf: (libc)Variable Arguments Output.
|
* vswscanf: (libc)Variable Arguments Input.
|
* vsyslog: (libc)syslog; vsyslog.
|
* vtimes: (libc)Resource Usage.
|
* vwarn: (libc)Error Messages.
|
* vwarnx: (libc)Error Messages.
|
* vwprintf: (libc)Variable Arguments Output.
|
* vwscanf: (libc)Variable Arguments Input.
|
* wait3: (libc)BSD Wait Functions.
|
* wait4: (libc)Process Completion.
|
* wait: (libc)Process Completion.
|
* waitpid: (libc)Process Completion.
|
* warn: (libc)Error Messages.
|
* warnx: (libc)Error Messages.
|
* wcpcpy: (libc)Copying Strings and Arrays.
|
* wcpncpy: (libc)Truncating Strings.
|
* wcrtomb: (libc)Converting a Character.
|
* wcscasecmp: (libc)String/Array Comparison.
|
* wcscat: (libc)Concatenating Strings.
|
* wcschr: (libc)Search Functions.
|
* wcschrnul: (libc)Search Functions.
|
* wcscmp: (libc)String/Array Comparison.
|
* wcscoll: (libc)Collation Functions.
|
* wcscpy: (libc)Copying Strings and Arrays.
|
* wcscspn: (libc)Search Functions.
|
* wcsdup: (libc)Copying Strings and Arrays.
|
* wcsftime: (libc)Formatting Calendar Time.
|
* wcslen: (libc)String Length.
|
* wcsncasecmp: (libc)String/Array Comparison.
|
* wcsncat: (libc)Truncating Strings.
|
* wcsncmp: (libc)String/Array Comparison.
|
* wcsncpy: (libc)Truncating Strings.
|
* wcsnlen: (libc)String Length.
|
* wcsnrtombs: (libc)Converting Strings.
|
* wcspbrk: (libc)Search Functions.
|
* wcsrchr: (libc)Search Functions.
|
* wcsrtombs: (libc)Converting Strings.
|
* wcsspn: (libc)Search Functions.
|
* wcsstr: (libc)Search Functions.
|
* wcstod: (libc)Parsing of Floats.
|
* wcstof: (libc)Parsing of Floats.
|
* wcstoimax: (libc)Parsing of Integers.
|
* wcstok: (libc)Finding Tokens in a String.
|
* wcstol: (libc)Parsing of Integers.
|
* wcstold: (libc)Parsing of Floats.
|
* wcstoll: (libc)Parsing of Integers.
|
* wcstombs: (libc)Non-reentrant String Conversion.
|
* wcstoq: (libc)Parsing of Integers.
|
* wcstoul: (libc)Parsing of Integers.
|
* wcstoull: (libc)Parsing of Integers.
|
* wcstoumax: (libc)Parsing of Integers.
|
* wcstouq: (libc)Parsing of Integers.
|
* wcswcs: (libc)Search Functions.
|
* wcsxfrm: (libc)Collation Functions.
|
* wctob: (libc)Converting a Character.
|
* wctomb: (libc)Non-reentrant Character Conversion.
|
* wctrans: (libc)Wide Character Case Conversion.
|
* wctype: (libc)Classification of Wide Characters.
|
* wmemchr: (libc)Search Functions.
|
* wmemcmp: (libc)String/Array Comparison.
|
* wmemcpy: (libc)Copying Strings and Arrays.
|
* wmemmove: (libc)Copying Strings and Arrays.
|
* wmempcpy: (libc)Copying Strings and Arrays.
|
* wmemset: (libc)Copying Strings and Arrays.
|
* wordexp: (libc)Calling Wordexp.
|
* wordfree: (libc)Calling Wordexp.
|
* wprintf: (libc)Formatted Output Functions.
|
* write: (libc)I/O Primitives.
|
* writev: (libc)Scatter-Gather.
|
* wscanf: (libc)Formatted Input Functions.
|
* y0: (libc)Special Functions.
|
* y0f: (libc)Special Functions.
|
* y0l: (libc)Special Functions.
|
* y1: (libc)Special Functions.
|
* y1f: (libc)Special Functions.
|
* y1l: (libc)Special Functions.
|
* yn: (libc)Special Functions.
|
* ynf: (libc)Special Functions.
|
* ynl: (libc)Special Functions.
|
END-INFO-DIR-ENTRY
|
|
|
File: libc.info, Node: Obstacks, Next: Variable Size Automatic, Prev: Allocation Debugging, Up: Memory Allocation
|
|
3.2.5 Obstacks
|
--------------
|
|
An "obstack" is a pool of memory containing a stack of objects. You can
|
create any number of separate obstacks, and then allocate objects in
|
specified obstacks. Within each obstack, the last object allocated must
|
always be the first one freed, but distinct obstacks are independent of
|
each other.
|
|
Aside from this one constraint of order of freeing, obstacks are
|
totally general: an obstack can contain any number of objects of any
|
size. They are implemented with macros, so allocation is usually very
|
fast as long as the objects are small. And the only space
|
overhead per object is the padding needed to start each object on a
|
suitable boundary.
|
|
* Menu:
|
|
* Creating Obstacks:: How to declare an obstack in your program.
|
* Preparing for Obstacks:: Preparations needed before you can
|
use obstacks.
|
* Allocation in an Obstack:: Allocating objects in an obstack.
|
* Freeing Obstack Objects:: Freeing objects in an obstack.
|
* Obstack Functions:: The obstack functions are both
|
functions and macros.
|
* Growing Objects:: Making an object bigger by stages.
|
* Extra Fast Growing:: Extra-high-efficiency (though more
|
complicated) growing objects.
|
* Status of an Obstack:: Inquiries about the status of an obstack.
|
* Obstacks Data Alignment:: Controlling alignment of objects in obstacks.
|
* Obstack Chunks:: How obstacks obtain and release chunks;
|
efficiency considerations.
|
* Summary of Obstacks::
|
|
|
File: libc.info, Node: Creating Obstacks, Next: Preparing for Obstacks, Up: Obstacks
|
|
3.2.5.1 Creating Obstacks
|
.........................
|
|
The utilities for manipulating obstacks are declared in the header file
|
‘obstack.h’.
|
|
-- Data Type: struct obstack
|
An obstack is represented by a data structure of type ‘struct
|
obstack’. This structure has a small fixed size; it records the
|
status of the obstack and how to find the space in which objects
|
are allocated. It does not contain any of the objects themselves.
|
You should not try to access the contents of the structure
|
directly; use only the functions described in this chapter.
|
|
You can declare variables of type ‘struct obstack’ and use them as
|
obstacks, or you can allocate obstacks dynamically like any other kind
|
of object. Dynamic allocation of obstacks allows your program to have a
|
variable number of different stacks. (You can even allocate an obstack
|
structure in another obstack, but this is rarely useful.)
|
|
All the functions that work with obstacks require you to specify
|
which obstack to use. You do this with a pointer of type ‘struct
|
obstack *’. In the following, we often say “an obstack” when strictly
|
speaking the object at hand is such a pointer.
|
|
The objects in the obstack are packed into large blocks called
|
"chunks". The ‘struct obstack’ structure points to a chain of the
|
chunks currently in use.
|
|
The obstack library obtains a new chunk whenever you allocate an
|
object that won’t fit in the previous chunk. Since the obstack library
|
manages chunks automatically, you don’t need to pay much attention to
|
them, but you do need to supply a function which the obstack library
|
should use to get a chunk. Usually you supply a function which uses
|
‘malloc’ directly or indirectly. You must also supply a function to
|
free a chunk. These matters are described in the following section.
|
|
|
File: libc.info, Node: Preparing for Obstacks, Next: Allocation in an Obstack, Prev: Creating Obstacks, Up: Obstacks
|
|
3.2.5.2 Preparing for Using Obstacks
|
....................................
|
|
Each source file in which you plan to use the obstack functions must
|
include the header file ‘obstack.h’, like this:
|
|
#include <obstack.h>
|
|
Also, if the source file uses the macro ‘obstack_init’, it must
|
declare or define two functions or macros that will be called by the
|
obstack library. One, ‘obstack_chunk_alloc’, is used to allocate the
|
chunks of memory into which objects are packed. The other,
|
‘obstack_chunk_free’, is used to return chunks when the objects in them
|
are freed. These macros should appear before any use of obstacks in the
|
source file.
|
|
Usually these are defined to use ‘malloc’ via the intermediary
|
‘xmalloc’ (*note Unconstrained Allocation::). This is done with the
|
following pair of macro definitions:
|
|
#define obstack_chunk_alloc xmalloc
|
#define obstack_chunk_free free
|
|
Though the memory you get using obstacks really comes from ‘malloc’,
|
using obstacks is faster because ‘malloc’ is called less often, for
|
larger blocks of memory. *Note Obstack Chunks::, for full details.
|
|
At run time, before the program can use a ‘struct obstack’ object as
|
an obstack, it must initialize the obstack by calling ‘obstack_init’.
|
|
-- Function: int obstack_init (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe mem |
|
*Note POSIX Safety Concepts::.
|
|
Initialize obstack OBSTACK-PTR for allocation of objects. This
|
function calls the obstack’s ‘obstack_chunk_alloc’ function. If
|
allocation of memory fails, the function pointed to by
|
‘obstack_alloc_failed_handler’ is called. The ‘obstack_init’
|
function always returns 1 (Compatibility notice: Former versions of
|
obstack returned 0 if allocation failed).
|
|
Here are two examples of how to allocate the space for an obstack and
|
initialize it. First, an obstack that is a static variable:
|
|
static struct obstack myobstack;
|
…
|
obstack_init (&myobstack);
|
|
Second, an obstack that is itself dynamically allocated:
|
|
struct obstack *myobstack_ptr
|
= (struct obstack *) xmalloc (sizeof (struct obstack));
|
|
obstack_init (myobstack_ptr);
|
|
-- Variable: obstack_alloc_failed_handler
|
The value of this variable is a pointer to a function that
|
‘obstack’ uses when ‘obstack_chunk_alloc’ fails to allocate memory.
|
The default action is to print a message and abort. You should
|
supply a function that either calls ‘exit’ (*note Program
|
Termination::) or ‘longjmp’ (*note Non-Local Exits::) and doesn’t
|
return.
|
|
void my_obstack_alloc_failed (void)
|
…
|
obstack_alloc_failed_handler = &my_obstack_alloc_failed;
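
   The preparation steps described in this section can be put together
as in the following sketch. It assumes plain ‘malloc’ and ‘free’ for
chunk management instead of ‘xmalloc’; the names ‘pool’,
‘report_obstack_failure’ and ‘prepare_pool’ are hypothetical and chosen
only for illustration.

```c
#include <obstack.h>
#include <stdio.h>
#include <stdlib.h>

/* Tell the obstack macros how to obtain and release chunks.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical obstack used by this example.  */
struct obstack pool;

/* Hypothetical handler: report the failure and exit.
   It must not return.  */
void
report_obstack_failure (void)
{
  fputs ("obstack: out of memory\n", stderr);
  exit (EXIT_FAILURE);
}

/* Install the handler first, so that it also covers the chunk
   allocated by obstack_init, then initialize the obstack.
   Returns the value of obstack_init.  */
int
prepare_pool (void)
{
  obstack_alloc_failed_handler = report_obstack_failure;
  return obstack_init (&pool);
}
```

   Because ‘malloc’ is used directly here, a failed chunk allocation
reaches the handler rather than being caught earlier by ‘xmalloc’.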
|
|
|
File: libc.info, Node: Allocation in an Obstack, Next: Freeing Obstack Objects, Prev: Preparing for Obstacks, Up: Obstacks
|
|
3.2.5.3 Allocation in an Obstack
|
................................
|
|
The most direct way to allocate an object in an obstack is with
|
‘obstack_alloc’, which is invoked almost like ‘malloc’.
|
|
-- Function: void * obstack_alloc (struct obstack *OBSTACK-PTR, int
|
SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
This allocates an uninitialized block of SIZE bytes in an obstack
|
and returns its address. Here OBSTACK-PTR specifies which obstack
|
to allocate the block in; it is the address of the ‘struct obstack’
|
object which represents the obstack. Each obstack function or
|
macro requires you to specify an OBSTACK-PTR as the first argument.
|
|
This function calls the obstack’s ‘obstack_chunk_alloc’ function if
|
it needs to allocate a new chunk of memory; it calls
|
‘obstack_alloc_failed_handler’ if allocation of memory by
|
‘obstack_chunk_alloc’ failed.
|
|
   For example, here is a function that allocates a copy of a string STRING
|
in a specific obstack, which is in the variable ‘string_obstack’:
|
|
struct obstack string_obstack;
|
|
char *
|
     copystring (const char *string)
|
{
|
size_t len = strlen (string) + 1;
|
char *s = (char *) obstack_alloc (&string_obstack, len);
|
memcpy (s, string, len);
|
return s;
|
}
|
|
To allocate a block with specified contents, use the function
|
‘obstack_copy’, declared like this:
|
|
-- Function: void * obstack_copy (struct obstack *OBSTACK-PTR, void
|
*ADDRESS, int SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
This allocates a block and initializes it by copying SIZE bytes of
|
data starting at ADDRESS. It calls ‘obstack_alloc_failed_handler’
|
if allocation of memory by ‘obstack_chunk_alloc’ failed.
|
|
-- Function: void * obstack_copy0 (struct obstack *OBSTACK-PTR, void
|
*ADDRESS, int SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
Like ‘obstack_copy’, but appends an extra byte containing a null
|
character. This extra byte is not counted in the argument SIZE.
|
|
The ‘obstack_copy0’ function is convenient for copying a sequence of
|
characters into an obstack as a null-terminated string. Here is an
|
example of its use:
|
|
char *
|
obstack_savestring (char *addr, int size)
|
{
|
return obstack_copy0 (&myobstack, addr, size);
|
}
|
|
Contrast this with the previous example of ‘savestring’ using ‘malloc’
|
(*note Basic Allocation::).
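
   The two allocation styles can be combined. The following sketch
allocates a small record with ‘obstack_alloc’ and stores the text it
refers to with ‘obstack_copy0’, so that both live in the same obstack.
The type ‘struct name_entry’ and the function ‘make_entry’ are
hypothetical names used only for this illustration.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical record type stored in the obstack.  */
struct name_entry
{
  char *name;   /* Null-terminated copy, also in the obstack.  */
  int length;   /* Length of NAME, not counting the null.  */
};

/* Allocate an entry in OB describing the first LEN bytes of TEXT.
   The record is allocated uninitialized; the text is copied and
   null-terminated by obstack_copy0.  */
struct name_entry *
make_entry (struct obstack *ob, const char *text, int len)
{
  struct name_entry *e = obstack_alloc (ob, sizeof *e);
  e->name = obstack_copy0 (ob, text, len);
  e->length = len;
  return e;
}
```

   Since the record is allocated before its text, freeing the record
with ‘obstack_free’ releases the text as well.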
|
|
|
File: libc.info, Node: Freeing Obstack Objects, Next: Obstack Functions, Prev: Allocation in an Obstack, Up: Obstacks
|
|
3.2.5.4 Freeing Objects in an Obstack
|
.....................................
|
|
To free an object allocated in an obstack, use the function
|
‘obstack_free’. Since the obstack is a stack of objects, freeing one
|
object automatically frees all other objects allocated more recently in
|
the same obstack.
|
|
-- Function: void obstack_free (struct obstack *OBSTACK-PTR, void
|
*OBJECT)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt | *Note POSIX Safety Concepts::.
|
|
If OBJECT is a null pointer, everything allocated in the obstack is
|
freed. Otherwise, OBJECT must be the address of an object
|
allocated in the obstack. Then OBJECT is freed, along with
|
everything allocated in OBSTACK-PTR since OBJECT.
|
|
Note that if OBJECT is a null pointer, the result is an uninitialized
|
obstack. To free all memory in an obstack but leave it valid for
|
further allocation, call ‘obstack_free’ with the address of the first
|
object allocated on the obstack:
|
|
obstack_free (obstack_ptr, first_object_allocated_ptr);
|
|
Recall that the objects in an obstack are grouped into chunks. When
|
all the objects in a chunk become free, the obstack library
|
automatically frees the chunk (*note Preparing for Obstacks::). Then
|
other obstacks, or non-obstack allocation, can reuse the space of the
|
chunk.
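
   The stack discipline can be observed directly, as in the sketch
below. It assumes that the three small objects fit in a single chunk,
which should hold for the default chunk size; the function name
‘free_releases_newer_objects’ is hypothetical.

```c
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Allocate three small objects in OB, free the second one, and
   allocate again.  Returns nonzero if the new object reuses the
   address of the second one, which shows that freeing the second
   object also released the third.  */
int
free_releases_newer_objects (struct obstack *ob)
{
  char *first = obstack_alloc (ob, 16);
  char *second = obstack_alloc (ob, 16);
  char *third = obstack_alloc (ob, 16);
  (void) third;

  /* Frees SECOND and THIRD; FIRST stays valid.  */
  obstack_free (ob, second);

  char *reused = obstack_alloc (ob, 16);
  int same_address = (reused == second);

  /* Empty the obstack but keep it usable for further allocation.  */
  obstack_free (ob, first);
  return same_address;
}
```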
|
|
|
File: libc.info, Node: Obstack Functions, Next: Growing Objects, Prev: Freeing Obstack Objects, Up: Obstacks
|
|
3.2.5.5 Obstack Functions and Macros
|
....................................
|
|
The interfaces for using obstacks may be defined either as functions or
|
as macros, depending on the compiler. The obstack facility works with
|
all C compilers, including both ISO C and traditional C, but there are
|
precautions you must take if you plan to use compilers other than GNU C.
|
|
If you are using an old-fashioned non-ISO C compiler, all the obstack
|
“functions” are actually defined only as macros. You can call these
|
macros like functions, but you cannot use them in any other way (for
|
example, you cannot take their address).
|
|
Calling the macros requires a special precaution: namely, the first
|
operand (the obstack pointer) may not contain any side effects, because
|
it may be computed more than once. For example, if you write this:
|
|
obstack_alloc (get_obstack (), 4);
|
|
you will find that ‘get_obstack’ may be called several times. If you
|
use ‘*obstack_list_ptr++’ as the obstack pointer argument, you will get
|
very strange results since the incrementation may occur several times.
|
|
In ISO C, each function has both a macro definition and a function
|
definition. The function definition is used if you take the address of
|
the function without calling it. An ordinary call uses the macro
|
definition by default, but you can request the function definition
|
instead by writing the function name in parentheses, as shown here:
|
|
char *x;
|
void *(*funcp) ();
|
/* Use the macro. */
|
x = (char *) obstack_alloc (obptr, size);
|
/* Call the function. */
|
x = (char *) (obstack_alloc) (obptr, size);
|
/* Take the address of the function. */
|
funcp = obstack_alloc;
|
|
This is the same situation that exists in ISO C for the standard library
|
functions. *Note Macro Definitions::.
|
|
*Warning:* When you do use the macros, you must observe the
|
precaution of avoiding side effects in the first operand, even in ISO C.
|
|
If you use the GNU C compiler, this precaution is not necessary,
|
because various language extensions in GNU C permit defining the macros
|
so as to compute each argument only once.
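
   A portable way to follow this precaution is to evaluate the obstack
pointer once, store it in a variable, and pass the variable to the
macro. The sketch below illustrates this; the names ‘pools’,
‘select_pool’, ‘init_pools’ and ‘alloc_from_next_pool’ are hypothetical.

```c
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical round-robin pair of obstacks.  */
struct obstack pools[2];
int next_pool;
int selector_calls;

/* Hypothetical selector with side effects: it advances NEXT_POOL
   and counts how often it has been called.  */
struct obstack *
select_pool (void)
{
  struct obstack *ob = &pools[next_pool];
  next_pool = (next_pool + 1) % 2;
  selector_calls++;
  return ob;
}

/* Initialize both obstacks.  */
void
init_pools (void)
{
  obstack_init (&pools[0]);
  obstack_init (&pools[1]);
}

/* Evaluate the selector exactly once, then hand the plain variable
   to the macro, which may expand its first operand several times.  */
void *
alloc_from_next_pool (int size)
{
  struct obstack *ob = select_pool ();
  return obstack_alloc (ob, size);
}
```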
|
|
|
File: libc.info, Node: Growing Objects, Next: Extra Fast Growing, Prev: Obstack Functions, Up: Obstacks
|
|
3.2.5.6 Growing Objects
|
.......................
|
|
Because memory in obstack chunks is used sequentially, it is possible to
|
build up an object step by step, adding one or more bytes at a time to
|
the end of the object. With this technique, you do not need to know how
|
much data you will put in the object until you come to the end of it.
|
We call this the technique of "growing objects". The special functions
|
for adding data to the growing object are described in this section.
|
|
You don’t need to do anything special when you start to grow an
|
object. Using one of the functions to add data to the object
|
automatically starts it. However, it is necessary to say explicitly
|
when the object is finished. This is done with the function
|
‘obstack_finish’.
|
|
The actual address of the object thus built up is not known until the
|
object is finished. Until then, it always remains possible that you
|
will add so much data that the object must be copied into a new chunk.
|
|
While the obstack is in use for a growing object, you cannot use it
|
for ordinary allocation of another object. If you try to do so, the
|
space already added to the growing object will become part of the other
|
object.
|
|
-- Function: void obstack_blank (struct obstack *OBSTACK-PTR, int SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
The most basic function for adding to a growing object is
|
‘obstack_blank’, which adds space without initializing it.
|
|
-- Function: void obstack_grow (struct obstack *OBSTACK-PTR, void
|
*DATA, int SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
To add a block of initialized space, use ‘obstack_grow’, which is
|
the growing-object analogue of ‘obstack_copy’. It adds SIZE bytes
|
of data to the growing object, copying the contents from DATA.
|
|
-- Function: void obstack_grow0 (struct obstack *OBSTACK-PTR, void
|
*DATA, int SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
This is the growing-object analogue of ‘obstack_copy0’. It adds
|
SIZE bytes copied from DATA, followed by an additional null
|
character.
|
|
-- Function: void obstack_1grow (struct obstack *OBSTACK-PTR, char C)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
To add one character at a time, use the function ‘obstack_1grow’.
|
It adds a single byte containing C to the growing object.
|
|
-- Function: void obstack_ptr_grow (struct obstack *OBSTACK-PTR, void
|
*DATA)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
     To add the value of a pointer, use the function
|
‘obstack_ptr_grow’. It adds ‘sizeof (void *)’ bytes containing the
|
value of DATA.
|
|
-- Function: void obstack_int_grow (struct obstack *OBSTACK-PTR, int
|
DATA)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
A single value of type ‘int’ can be added by using the
|
‘obstack_int_grow’ function. It adds ‘sizeof (int)’ bytes to the
|
growing object and initializes them with the value of DATA.
|
|
-- Function: void * obstack_finish (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt | *Note POSIX Safety Concepts::.
|
|
When you are finished growing the object, use the function
|
‘obstack_finish’ to close it off and return its final address.
|
|
Once you have finished the object, the obstack is available for
|
ordinary allocation or for growing another object.
|
|
This function can return a null pointer under the same conditions
|
as ‘obstack_alloc’ (*note Allocation in an Obstack::).
|
|
When you build an object by growing it, you will probably need to
|
know afterward how long it became. You need not keep track of this as
|
you grow the object, because you can find out the length from the
|
obstack just before finishing the object with the function
|
‘obstack_object_size’, declared as follows:
|
|
-- Function: int obstack_object_size (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
This function returns the current size of the growing object, in
|
bytes. Remember to call this function _before_ finishing the
|
object. After it is finished, ‘obstack_object_size’ will return
|
zero.
|
|
If you have started growing an object and wish to cancel it, you
|
should finish it and then free it, like this:
|
|
obstack_free (obstack_ptr, obstack_finish (obstack_ptr));
|
|
This has no effect if no object was growing.
|
|
You can use ‘obstack_blank’ with a negative size argument to make the
|
current object smaller. Just don’t try to shrink it beyond zero
|
length—there’s no telling what will happen if you do that.
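
   As an illustration of the growing-object functions, here is a sketch
that joins several strings with commas without knowing the total length
in advance. The function name ‘join_words’ and its parameters are
hypothetical.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Join the COUNT strings in WORDS with commas, growing the result
   in OB.  Stores the length of the result, not counting the
   terminating null, in *LENGTH and returns the finished string.  */
char *
join_words (struct obstack *ob, const char **words, int count,
            int *length)
{
  for (int i = 0; i < count; i++)
    {
      if (i > 0)
        obstack_1grow (ob, ',');
      obstack_grow (ob, words[i], (int) strlen (words[i]));
    }

  /* Ask for the size before finishing; afterward it would be zero.  */
  *length = (int) obstack_object_size (ob);

  /* Terminate the string and close off the object.  */
  obstack_1grow (ob, '\0');
  return obstack_finish (ob);
}
```

   The address of the result is taken only from ‘obstack_finish’,
since the object may move to a new chunk while it is growing.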
|
|
|
File: libc.info, Node: Extra Fast Growing, Next: Status of an Obstack, Prev: Growing Objects, Up: Obstacks
|
|
3.2.5.7 Extra Fast Growing Objects
|
..................................
|
|
The usual functions for growing objects incur overhead for checking
|
whether there is room for the new growth in the current chunk. If you
|
are frequently constructing objects in small steps of growth, this
|
overhead can be significant.
|
|
You can reduce the overhead by using special “fast growth” functions
|
that grow the object without checking. In order to have a robust
|
program, you must do the checking yourself. If you do this checking in
|
the simplest way each time you are about to add data to the object, you
|
have not saved anything, because that is what the ordinary growth
|
functions do. But if you can arrange to check less often, or check more
|
efficiently, then you make the program faster.
|
|
The function ‘obstack_room’ returns the amount of room available in
|
the current chunk. It is declared as follows:
|
|
-- Function: int obstack_room (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
This returns the number of bytes that can be added safely to the
|
current growing object (or to an object about to be started) in
|
obstack OBSTACK-PTR using the fast growth functions.
|
|
While you know there is room, you can use these fast growth functions
|
for adding data to a growing object:
|
|
-- Function: void obstack_1grow_fast (struct obstack *OBSTACK-PTR, char
|
C)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe
|
corrupt mem | *Note POSIX Safety Concepts::.
|
|
The function ‘obstack_1grow_fast’ adds one byte containing the
|
character C to the growing object in obstack OBSTACK-PTR.
|
|
-- Function: void obstack_ptr_grow_fast (struct obstack *OBSTACK-PTR,
|
void *DATA)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
The function ‘obstack_ptr_grow_fast’ adds ‘sizeof (void *)’ bytes
|
containing the value of DATA to the growing object in obstack
|
OBSTACK-PTR.
|
|
-- Function: void obstack_int_grow_fast (struct obstack *OBSTACK-PTR,
|
int DATA)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
The function ‘obstack_int_grow_fast’ adds ‘sizeof (int)’ bytes
|
containing the value of DATA to the growing object in obstack
|
OBSTACK-PTR.
|
|
-- Function: void obstack_blank_fast (struct obstack *OBSTACK-PTR, int
|
SIZE)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
The function ‘obstack_blank_fast’ adds SIZE bytes to the growing
|
object in obstack OBSTACK-PTR without initializing them.
|
|
When you check for space using ‘obstack_room’ and there is not enough
|
room for what you want to add, the fast growth functions are not safe.
|
In this case, simply use the corresponding ordinary growth function
|
instead. Very soon this will copy the object to a new chunk; then there
|
will be lots of room available again.
|
|
So, each time you use an ordinary growth function, check afterward
|
for sufficient space using ‘obstack_room’. Once the object is copied to
|
a new chunk, there will be plenty of space again, so the program will
|
start using the fast growth functions again.
|
|
Here is an example:
|
|
void
|
add_string (struct obstack *obstack, const char *ptr, int len)
|
{
|
while (len > 0)
|
{
|
int room = obstack_room (obstack);
|
if (room == 0)
|
{
|
/* Not enough room. Add one character slowly,
|
which may copy to a new chunk and make room. */
|
obstack_1grow (obstack, *ptr++);
|
len--;
|
}
|
else
|
{
|
if (room > len)
|
room = len;
|
/* Add fast as much as we have room for. */
|
len -= room;
|
while (room-- > 0)
|
obstack_1grow_fast (obstack, *ptr++);
|
}
|
}
|
}
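
   The same idea applies to the other fast growth functions. The
following sketch checks the available room once for a whole batch of
‘int’ values and falls back to the ordinary function when the batch does
not fit. The function name ‘add_ints’ is hypothetical, and the sketch
assumes the growing object holds only ‘int’ values so that each store is
suitably aligned.

```c
#include <obstack.h>
#include <stddef.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Append the COUNT values in VALUES to the object growing in OB.  */
void
add_ints (struct obstack *ob, const int *values, int count)
{
  size_t needed = (size_t) count * sizeof (int);

  if ((size_t) obstack_room (ob) >= needed)
    {
      /* One check covers the whole batch.  */
      for (int i = 0; i < count; i++)
        obstack_int_grow_fast (ob, values[i]);
    }
  else
    {
      /* Not enough room: the ordinary function checks each time and
         moves the object to a new chunk when necessary.  */
      for (int i = 0; i < count; i++)
        obstack_int_grow (ob, values[i]);
    }
}
```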
|
|
|
File: libc.info, Node: Status of an Obstack, Next: Obstacks Data Alignment, Prev: Extra Fast Growing, Up: Obstacks
|
|
3.2.5.8 Status of an Obstack
|
............................
|
|
Here are functions that provide information on the current status of
|
allocation in an obstack. You can use them to learn about an object
|
while still growing it.
|
|
-- Function: void * obstack_base (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe | AS-Unsafe corrupt | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function returns the tentative address of the beginning of the
|
currently growing object in OBSTACK-PTR. If you finish the object
|
immediately, it will have that address. If you make it larger
|
first, it may outgrow the current chunk—then its address will
|
change!
|
|
If no object is growing, this value says where the next object you
|
allocate will start (once again assuming it fits in the current
|
chunk).
|
|
-- Function: void * obstack_next_free (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe | AS-Unsafe corrupt | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function returns the address of the first free byte in the
|
current chunk of obstack OBSTACK-PTR. This is the end of the
|
currently growing object. If no object is growing,
|
‘obstack_next_free’ returns the same value as ‘obstack_base’.
|
|
-- Function: int obstack_object_size (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
This function returns the size in bytes of the currently growing
|
object. This is equivalent to
|
|
obstack_next_free (OBSTACK-PTR) - obstack_base (OBSTACK-PTR)
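
   These status functions make it possible to inspect an object while
it is still growing. The sketch below computes the current length from
the two pointers and looks at the most recently added byte; the function
names ‘growing_length’ and ‘last_byte_grown’ are hypothetical.

```c
#include <obstack.h>
#include <stddef.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Return the number of bytes in the object growing in OB, computed
   from the two status pointers.  */
ptrdiff_t
growing_length (struct obstack *ob)
{
  char *base = (char *) obstack_base (ob);
  char *end = (char *) obstack_next_free (ob);
  return end - base;
}

/* Return the last byte added to the growing object, or -1 if the
   object is still empty.  */
int
last_byte_grown (struct obstack *ob)
{
  if (growing_length (ob) == 0)
    return -1;
  return (unsigned char) ((char *) obstack_next_free (ob))[-1];
}
```

   Both pointers are tentative: they should be fetched again after any
further growth, because the object may have moved to a new chunk.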
|
|
|
File: libc.info, Node: Obstacks Data Alignment, Next: Obstack Chunks, Prev: Status of an Obstack, Up: Obstacks
|
|
3.2.5.9 Alignment of Data in Obstacks
|
.....................................
|
|
Each obstack has an "alignment boundary"; each object allocated in the
|
obstack automatically starts on an address that is a multiple of the
|
specified boundary. By default, this boundary is chosen so that the
|
object can hold any type of data.
|
|
To access an obstack’s alignment boundary, use the macro
|
‘obstack_alignment_mask’, which can be used as if it were declared like this:
|
|
-- Macro: int obstack_alignment_mask (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The value is a bit mask; a bit that is 1 indicates that the
|
corresponding bit in the address of an object should be 0. The
|
mask value should be one less than a power of 2; the effect is that
|
all object addresses are multiples of that power of 2. The default
|
value of the mask is a value that allows aligned objects to hold
|
any type of data: for example, if its value is 3, any type of data
|
can be stored at locations whose addresses are multiples of 4. A
|
mask value of 0 means an object can start on any multiple of 1
|
(that is, no alignment is required).
|
|
The expansion of the macro ‘obstack_alignment_mask’ is an lvalue,
|
so you can alter the mask by assignment. For example, this
|
statement:
|
|
obstack_alignment_mask (obstack_ptr) = 0;
|
|
has the effect of turning off alignment processing in the specified
|
obstack.
|
|
Note that a change in alignment mask does not take effect until
|
_after_ the next time an object is allocated or finished in the obstack.
|
If you are not growing an object, you can make the new alignment mask
|
take effect immediately by calling ‘obstack_finish’. This will finish a
|
zero-length object and then do proper alignment for the next object.
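As an illustration of the paragraph above, the following sketch raises the alignment to 16 bytes, finishes a zero-length object so that the new mask applies, and then checks two allocations.  The function name is hypothetical and ‘malloc’/‘free’ are assumed as chunk hooks.

```c
#include <obstack.h>
#include <stdint.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: returns 1 if objects allocated after the mask
   change start on 16-byte boundaries.  */
int
allocates_on_16_byte_boundary (void)
{
  struct obstack ob;
  void *first, *second;
  int ok;

  obstack_init (&ob);
  obstack_alignment_mask (&ob) = 15;    /* one less than 16 */
  (void) obstack_finish (&ob);          /* zero-length object; the new
                                           mask applies from here on */

  first = obstack_alloc (&ob, 5);       /* odd size on purpose */
  second = obstack_alloc (&ob, 24);
  ok = ((uintptr_t) first & 15) == 0 && ((uintptr_t) second & 15) == 0;

  obstack_free (&ob, NULL);
  return ok;
}
```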
|
|
|
File: libc.info, Node: Obstack Chunks, Next: Summary of Obstacks, Prev: Obstacks Data Alignment, Up: Obstacks
|
|
3.2.5.10 Obstack Chunks
|
.......................
|
|
Obstacks work by allocating space for themselves in large chunks, and
|
then parceling out space in the chunks to satisfy your requests. Chunks
|
are normally 4096 bytes long unless you specify a different chunk size.
|
The chunk size includes 8 bytes of overhead that are not actually used
|
for storing objects. Regardless of the specified size, longer chunks
|
will be allocated when necessary for long objects.
|
|
The obstack library allocates chunks by calling the function
|
‘obstack_chunk_alloc’, which you must define. When a chunk is no longer
|
needed because you have freed all the objects in it, the obstack library
|
frees the chunk by calling ‘obstack_chunk_free’, which you must also
|
define.
|
|
These two must be defined (as macros) or declared (as functions) in
|
each source file that uses ‘obstack_init’ (*note Creating Obstacks::).
|
Most often they are defined as macros like this:
|
|
#define obstack_chunk_alloc malloc
|
#define obstack_chunk_free free
|
|
Note that these are simple macros (no arguments). Macro definitions
|
with arguments will not work! It is necessary that
|
‘obstack_chunk_alloc’ or ‘obstack_chunk_free’, alone, expand into a
|
function name if it is not itself a function name.
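The macros may also name your own functions.  The sketch below is one possible arrangement, not a required one: the two hook functions and the counter are hypothetical names, and they are given the same signatures as ‘malloc’ and ‘free’.

```c
#include <obstack.h>
#include <stdio.h>
#include <stdlib.h>

static int chunks_live;         /* hypothetical bookkeeping counter */

/* Hypothetical hook: allocate a chunk and count it.  */
static void *
counting_chunk_alloc (size_t size)
{
  void *chunk = malloc (size);
  if (chunk == NULL)
    {
      fputs ("obstack: out of memory\n", stderr);
      abort ();
    }
  chunks_live++;
  return chunk;
}

/* Hypothetical hook: free a chunk and uncount it.  */
static void
counting_chunk_free (void *chunk)
{
  free (chunk);
  chunks_live--;
}

/* Simple macros with no arguments, each expanding to a function name.  */
#define obstack_chunk_alloc counting_chunk_alloc
#define obstack_chunk_free counting_chunk_free

/* Returns 1 if chunks were obtained through the hook and all of them
   were returned through it.  */
int
chunk_hooks_balance (void)
{
  struct obstack ob;
  int live_while_in_use;

  obstack_init (&ob);
  (void) obstack_alloc (&ob, 100);
  live_while_in_use = chunks_live;
  obstack_free (&ob, NULL);     /* frees every chunk */
  return live_while_in_use > 0 && chunks_live == 0;
}
```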
|
|
If you allocate chunks with ‘malloc’, the chunk size should be a
|
power of 2. The default chunk size, 4096, was chosen because it is long
|
enough to satisfy many typical requests on the obstack yet short enough
|
not to waste too much memory in the portion of the last chunk not yet
|
used.
|
|
-- Macro: int obstack_chunk_size (struct obstack *OBSTACK-PTR)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This returns the chunk size of the given obstack.
|
|
Since this macro expands to an lvalue, you can specify a new chunk
|
size by assigning it a new value. Doing so does not affect the chunks
|
already allocated, but will change the size of chunks allocated for that
|
particular obstack in the future. It is unlikely to be useful to make
|
the chunk size smaller, but making it larger might improve efficiency if
|
you are allocating many objects whose size is comparable to the chunk
|
size. Here is how to do so cleanly:
|
|
if (obstack_chunk_size (obstack_ptr) < NEW-CHUNK-SIZE)
|
obstack_chunk_size (obstack_ptr) = NEW-CHUNK-SIZE;
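A self-contained version of the fragment above might look like the following sketch.  The helper name ‘raise_chunk_size’ is hypothetical, and ‘malloc’/‘free’ are assumed as chunk hooks.

```c
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: make sure chunks allocated from now on are at
   least NEW_SIZE bytes long; never shrink.  Returns the chunk size in
   effect afterwards.  */
long
raise_chunk_size (struct obstack *ob, long new_size)
{
  if (obstack_chunk_size (ob) < new_size)
    obstack_chunk_size (ob) = new_size;
  return obstack_chunk_size (ob);
}
```

Chunks that already exist keep their size; only later chunk allocations for this obstack use the new value.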
|
|
|
File: libc.info, Node: Summary of Obstacks, Prev: Obstack Chunks, Up: Obstacks
|
|
3.2.5.11 Summary of Obstack Functions
|
.....................................
|
|
Here is a summary of all the functions associated with obstacks. Each
|
takes the address of an obstack (‘struct obstack *’) as its first
|
argument.
|
|
‘void obstack_init (struct obstack *OBSTACK-PTR)’
|
Initialize use of an obstack. *Note Creating Obstacks::.
|
|
‘void *obstack_alloc (struct obstack *OBSTACK-PTR, int SIZE)’
|
Allocate an object of SIZE uninitialized bytes. *Note Allocation
|
in an Obstack::.
|
|
‘void *obstack_copy (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)’
|
Allocate an object of SIZE bytes, with contents copied from
|
ADDRESS. *Note Allocation in an Obstack::.
|
|
‘void *obstack_copy0 (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)’
|
Allocate an object of SIZE+1 bytes, with SIZE of them copied from
|
ADDRESS, followed by a null character at the end. *Note Allocation
|
in an Obstack::.
|
|
‘void obstack_free (struct obstack *OBSTACK-PTR, void *OBJECT)’
|
Free OBJECT (and everything allocated in the specified obstack more
|
recently than OBJECT). *Note Freeing Obstack Objects::.
|
|
‘void obstack_blank (struct obstack *OBSTACK-PTR, int SIZE)’
|
Add SIZE uninitialized bytes to a growing object. *Note Growing
|
Objects::.
|
|
‘void obstack_grow (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)’
|
Add SIZE bytes, copied from ADDRESS, to a growing object. *Note
|
Growing Objects::.
|
|
‘void obstack_grow0 (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)’
|
Add SIZE bytes, copied from ADDRESS, to a growing object, and then
|
add another byte containing a null character. *Note Growing
|
Objects::.
|
|
‘void obstack_1grow (struct obstack *OBSTACK-PTR, char DATA-CHAR)’
|
Add one byte containing DATA-CHAR to a growing object. *Note
|
Growing Objects::.
|
|
‘void *obstack_finish (struct obstack *OBSTACK-PTR)’
|
Finalize the object that is growing and return its permanent
|
address. *Note Growing Objects::.
|
|
‘int obstack_object_size (struct obstack *OBSTACK-PTR)’
|
Get the current size of the currently growing object. *Note
|
Growing Objects::.
|
|
‘void obstack_blank_fast (struct obstack *OBSTACK-PTR, int SIZE)’
|
Add SIZE uninitialized bytes to a growing object without checking
|
that there is enough room. *Note Extra Fast Growing::.
|
|
‘void obstack_1grow_fast (struct obstack *OBSTACK-PTR, char DATA-CHAR)’
|
Add one byte containing DATA-CHAR to a growing object without
|
checking that there is enough room. *Note Extra Fast Growing::.
|
|
‘int obstack_room (struct obstack *OBSTACK-PTR)’
|
Get the amount of room now available for growing the current
|
object. *Note Extra Fast Growing::.
|
|
‘int obstack_alignment_mask (struct obstack *OBSTACK-PTR)’
|
The mask used for aligning the beginning of an object. This is an
|
lvalue. *Note Obstacks Data Alignment::.
|
|
‘int obstack_chunk_size (struct obstack *OBSTACK-PTR)’
|
The size for allocating chunks. This is an lvalue. *Note Obstack
|
Chunks::.
|
|
‘void *obstack_base (struct obstack *OBSTACK-PTR)’
|
Tentative starting address of the currently growing object. *Note
|
Status of an Obstack::.
|
|
‘void *obstack_next_free (struct obstack *OBSTACK-PTR)’
|
Address just after the end of the currently growing object. *Note
|
Status of an Obstack::.
|
|
|
File: libc.info, Node: Variable Size Automatic, Prev: Obstacks, Up: Memory Allocation
|
|
3.2.6 Automatic Storage with Variable Size
|
------------------------------------------
|
|
The function ‘alloca’ supports a kind of half-dynamic allocation in
|
which blocks are allocated dynamically but freed automatically.
|
|
Allocating a block with ‘alloca’ is an explicit action; you can
|
allocate as many blocks as you wish, and compute the size at run time.
|
But all the blocks are freed when you exit the function that ‘alloca’
|
was called from, just as if they were automatic variables declared in
|
that function. There is no way to free the space explicitly.
|
|
The prototype for ‘alloca’ is in ‘stdlib.h’. This function is a BSD
|
extension.
|
|
-- Function: void * alloca (size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The return value of ‘alloca’ is the address of a block of SIZE
|
bytes of memory, allocated in the stack frame of the calling
|
function.
|
|
Do not use ‘alloca’ inside the arguments of a function call—you will
|
get unpredictable results, because the stack space for the ‘alloca’
|
would appear on the stack in the middle of the space for the function
|
arguments. An example of what to avoid is ‘foo (x, alloca (4), y)’.
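One way to stay clear of this problem is to store the result of ‘alloca’ in a local variable first and pass that variable.  The sketch below assumes SIZE is at least 1; both function names are hypothetical.

```c
#include <alloca.h>
#include <string.h>

/* Hypothetical callee: fill BUF with SIZE - 1 copies of C, terminate
   it, and return the resulting string length.  */
static size_t
fill_and_measure (char *buf, size_t size, char c)
{
  memset (buf, c, size - 1);
  buf[size - 1] = '\0';
  return strlen (buf);
}

size_t
safe_alloca_call (size_t size)
{
  /* Avoid: fill_and_measure (alloca (size), size, 'x');  */
  char *buf = alloca (size);            /* allocate first ...  */
  return fill_and_measure (buf, size, 'x');     /* ... then call */
}
```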
|
|
* Menu:
|
|
* Alloca Example:: Example of using ‘alloca’.
|
* Advantages of Alloca:: Reasons to use ‘alloca’.
|
* Disadvantages of Alloca:: Reasons to avoid ‘alloca’.
|
* GNU C Variable-Size Arrays:: Only in GNU C, here is an alternative
|
method of allocating dynamically and
|
freeing automatically.
|
|
|
File: libc.info, Node: Alloca Example, Next: Advantages of Alloca, Up: Variable Size Automatic
|
|
3.2.6.1 ‘alloca’ Example
|
........................
|
|
As an example of the use of ‘alloca’, here is a function that opens a
|
file name made from concatenating two argument strings, and returns a
|
file descriptor or minus one signifying failure:
|
|
int
|
open2 (char *str1, char *str2, int flags, int mode)
|
{
|
char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1);
|
stpcpy (stpcpy (name, str1), str2);
|
return open (name, flags, mode);
|
}
|
|
Here is how you would get the same results with ‘malloc’ and ‘free’:
|
|
int
|
open2 (char *str1, char *str2, int flags, int mode)
|
{
|
char *name = (char *) malloc (strlen (str1) + strlen (str2) + 1);
|
int desc;
|
if (name == 0)
|
fatal ("virtual memory exceeded");
|
stpcpy (stpcpy (name, str1), str2);
|
desc = open (name, flags, mode);
|
free (name);
|
return desc;
|
}
|
|
As you can see, it is simpler with ‘alloca’. But ‘alloca’ has other,
|
more important advantages, and some disadvantages.
|
|
|
File: libc.info, Node: Advantages of Alloca, Next: Disadvantages of Alloca, Prev: Alloca Example, Up: Variable Size Automatic
|
|
3.2.6.2 Advantages of ‘alloca’
|
..............................
|
|
Here are the reasons why ‘alloca’ may be preferable to ‘malloc’:
|
|
• Using ‘alloca’ wastes very little space and is very fast. (It is
|
open-coded by the GNU C compiler.)
|
|
• Since ‘alloca’ does not have separate pools for different sizes of
|
blocks, space used for any size block can be reused for any other
|
size. ‘alloca’ does not cause memory fragmentation.
|
|
• Nonlocal exits done with ‘longjmp’ (*note Non-Local Exits::)
|
automatically free the space allocated with ‘alloca’ when they exit
|
through the function that called ‘alloca’. This is the most
|
important reason to use ‘alloca’.
|
|
To illustrate this, suppose you have a function
|
‘open_or_report_error’ which returns a descriptor, like ‘open’, if
|
it succeeds, but does not return to its caller if it fails. If the
|
file cannot be opened, it prints an error message and jumps out to
|
the command level of your program using ‘longjmp’. Let’s change
|
‘open2’ (*note Alloca Example::) to use this subroutine:
|
|
int
|
open2 (char *str1, char *str2, int flags, int mode)
|
{
|
char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1);
|
stpcpy (stpcpy (name, str1), str2);
|
return open_or_report_error (name, flags, mode);
|
}
|
|
Because of the way ‘alloca’ works, the memory it allocates is freed
|
even when an error occurs, with no special effort required.
|
|
By contrast, the previous definition of ‘open2’ (which uses
|
‘malloc’ and ‘free’) would develop a memory leak if it were changed
|
in this way. Even if you are willing to make more changes to fix
|
it, there is no easy way to do so.
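The following sketch illustrates the ‘longjmp’ case in isolation.  It does not open any file: the failing function simply builds a name with ‘alloca’ and jumps out, and all names in it are hypothetical.

```c
#include <alloca.h>
#include <setjmp.h>
#include <string.h>

static jmp_buf command_level;   /* hypothetical recovery point */

/* Build a name on the stack, then bail out without returning.  The
   block obtained from alloca goes away with the abandoned frame.  */
static void
fail_after_alloca (const char *str1, const char *str2)
{
  char *name = alloca (strlen (str1) + strlen (str2) + 1);
  strcpy (name, str1);
  strcat (name, str2);
  longjmp (command_level, 1);
}

/* Run ATTEMPTS failing calls and count how many were recovered.  */
int
count_recoveries (int attempts)
{
  volatile int recovered = 0;
  volatile int i;

  for (i = 0; i < attempts; i++)
    {
      if (setjmp (command_level) == 0)
        fail_after_alloca ("dir/", "file");
      else
        recovered++;
    }
  return recovered;
}
```

Because each nonlocal exit discards the stack frame that held the block, the loop can run many times without the stack growing.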
|
|
|
File: libc.info, Node: Disadvantages of Alloca, Next: GNU C Variable-Size Arrays, Prev: Advantages of Alloca, Up: Variable Size Automatic
|
|
3.2.6.3 Disadvantages of ‘alloca’
|
.................................
|
|
These are the disadvantages of ‘alloca’ in comparison with ‘malloc’:
|
|
• If you try to allocate more memory than the machine can provide,
|
you don’t get a clean error message. Instead you get a fatal
|
signal like the one you would get from an infinite recursion;
|
probably a segmentation violation (*note Program Error Signals::).
|
|
• Some non-GNU systems fail to support ‘alloca’, so it is less
|
portable. However, a slower emulation of ‘alloca’ written in C is
|
available for use on systems with this deficiency.
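A common way to limit the first problem, shown here only as a sketch, is to use ‘alloca’ for small requests and fall back to ‘malloc’ for large ones.  The threshold ‘MAX_STACK_BYTES’ and the function name are hypothetical; a suitable limit depends on the program.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STACK_BYTES 4096    /* hypothetical threshold */

/* Concatenate A and B in temporary storage and return the length of
   the result, or 0 if heap allocation fails.  */
size_t
joined_length (const char *a, const char *b)
{
  size_t need = strlen (a) + strlen (b) + 1;
  int on_heap = need > MAX_STACK_BYTES;
  char *name;
  size_t len;

  if (on_heap)
    {
      name = malloc (need);     /* failure is reported cleanly */
      if (name == NULL)
        return 0;
    }
  else
    name = alloca (need);       /* freed when this function returns */

  strcpy (name, a);
  strcat (name, b);
  len = strlen (name);

  if (on_heap)
    free (name);
  return len;
}
```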
|
|
|
File: libc.info, Node: GNU C Variable-Size Arrays, Prev: Disadvantages of Alloca, Up: Variable Size Automatic
|
|
3.2.6.4 GNU C Variable-Size Arrays
|
..................................
|
|
In GNU C, you can replace most uses of ‘alloca’ with an array of
|
variable size. Here is how ‘open2’ would look then:
|
|
int open2 (char *str1, char *str2, int flags, int mode)
|
{
|
char name[strlen (str1) + strlen (str2) + 1];
|
stpcpy (stpcpy (name, str1), str2);
|
return open (name, flags, mode);
|
}
|
|
But ‘alloca’ is not always equivalent to a variable-sized array, for
|
several reasons:
|
|
• A variable size array’s space is freed at the end of the scope of
|
the name of the array. The space allocated with ‘alloca’ remains
|
until the end of the function.
|
|
• It is possible to use ‘alloca’ within a loop, allocating an
|
additional block on each iteration. This is impossible with
|
variable-sized arrays.
|
|
*NB:* If you mix use of ‘alloca’ and variable-sized arrays within one
|
function, exiting a scope in which a variable-sized array was declared
|
frees all blocks allocated with ‘alloca’ during the execution of that
|
scope.
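The loop case can be sketched as follows.  Each iteration allocates one more list node with ‘alloca’, and all nodes remain valid until the function returns.  The structure and function names are hypothetical, and N is assumed small enough for the stack.

```c
#include <alloca.h>
#include <stddef.h>

struct node
{
  int value;
  struct node *next;
};

/* Build a list of the numbers 1..N on the stack, then sum it.  */
int
sum_with_stack_list (int n)
{
  struct node *head = NULL;
  struct node *p;
  int sum = 0;
  int i;

  for (i = 1; i <= n; i++)
    {
      /* An additional block on each iteration.  */
      struct node *nd = alloca (sizeof *nd);
      nd->value = i;
      nd->next = head;
      head = nd;
    }

  for (p = head; p != NULL; p = p->next)
    sum += p->value;
  return sum;
}
```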
|
|
|
File: libc.info, Node: Resizing the Data Segment, Next: Locking Pages, Prev: Memory Allocation, Up: Memory
|
|
3.3 Resizing the Data Segment
|
=============================
|
|
The symbols in this section are declared in ‘unistd.h’.
|
|
You will not normally use the functions in this section, because the
|
functions described in *note Memory Allocation:: are easier to use.
|
Those are interfaces to a GNU C Library memory allocator that uses the
|
functions below itself. The functions below are simple interfaces to
|
system calls.
|
|
-- Function: int brk (void *ADDR)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘brk’ sets the high end of the calling process’ data segment to
|
ADDR.
|
|
The address of the end of a segment is defined to be the address of
|
the last byte in the segment plus 1.
|
|
The function has no effect if ADDR is lower than the low end of the
|
data segment. (This is considered success, by the way.)
|
|
The function fails if it would cause the data segment to overlap
|
another segment or exceed the process’ data storage limit (*note
|
Limits on Resources::).
|
|
The function is named for a common historical case where data
|
storage and the stack are in the same segment. Data storage
|
allocation grows upward from the bottom of the segment while the
|
stack grows downward toward it from the top of the segment and the
|
curtain between them is called the "break".
|
|
The return value is zero on success. On failure, the return value
|
is ‘-1’ and ‘errno’ is set accordingly. The following ‘errno’
|
values are specific to this function:
|
|
‘ENOMEM’
|
The request would cause the data segment to overlap another
|
segment or exceed the process’ data storage limit.
|
|
-- Function: void *sbrk (ptrdiff_t DELTA)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is the same as ‘brk’ except that you specify the new
|
end of the data segment as an offset DELTA from the current end and
|
on success the return value is the address of the previous end of
|
the data segment (the start of any newly added space) instead of zero.
|
|
This means you can use ‘sbrk(0)’ to find out what the current end
|
of the data segment is.
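As a sketch of how ‘sbrk’ can be used, the hypothetical function below moves the end of the data segment up by DELTA bytes and back down again, using ‘sbrk(0)’ to observe the end at each step.  It assumes nothing else (for example ‘malloc’) moves the break in between.

```c
#define _DEFAULT_SOURCE         /* ask glibc to declare sbrk */
#include <stddef.h>
#include <unistd.h>

/* Grow the data segment by DELTA bytes, then shrink it back.
   Returns 1 if the end moved as expected, 0 on failure.  */
int
grow_and_shrink_break (ptrdiff_t delta)
{
  char *before = sbrk (0);
  char *grown;
  char *restored;

  if (sbrk (delta) == (void *) -1)
    return 0;                   /* errno is ENOMEM */
  grown = sbrk (0);

  if (sbrk (-delta) == (void *) -1)
    return 0;
  restored = sbrk (0);

  return grown - before == delta && restored == before;
}
```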
|
|
|
File: libc.info, Node: Locking Pages, Prev: Resizing the Data Segment, Up: Memory
|
|
3.4 Locking Pages
|
=================
|
|
You can tell the system to associate a particular virtual memory page
|
with a real page frame and keep it that way — i.e., cause the page to be
|
paged in if it isn’t already and mark it so it will never be paged out
|
and consequently will never cause a page fault. This is called
|
"locking" a page.
|
|
The functions in this section lock and unlock the calling process’
|
pages.
|
|
* Menu:
|
|
* Why Lock Pages:: Reasons to read this section.
|
* Locked Memory Details:: Everything you need to know about locked
|
memory
|
* Page Lock Functions:: Here’s how to do it.
|
|
|
File: libc.info, Node: Why Lock Pages, Next: Locked Memory Details, Up: Locking Pages
|
|
3.4.1 Why Lock Pages
|
--------------------
|
|
Because page faults cause paged out pages to be paged in transparently,
|
a process rarely needs to be concerned about locking pages. However,
|
there are two reasons people sometimes are:
|
|
• Speed. A page fault is transparent only insofar as the process is
|
not sensitive to how long it takes to do a simple memory access.
|
Time-critical processes, especially realtime processes, may not be
|
able to wait or may not be able to tolerate variance in execution
|
speed.
|
|
A process that needs to lock pages for this reason probably also
|
needs priority among other processes for use of the CPU. *Note
|
Priority::.
|
|
In some cases, the programmer knows better than the system’s demand
|
paging allocator which pages should remain in real memory to
|
optimize system performance. In this case, locking pages can help.
|
|
• Privacy. If you keep secrets in virtual memory and that virtual
|
memory gets paged out, that increases the chance that the secrets
|
will get out. If a password gets written out to disk swap space,
|
for example, it might still be there long after virtual and real
|
memory have been wiped clean.
|
|
Be aware that when you lock a page, that’s one fewer page frame that
|
can be used to back other virtual memory (by the same or other
|
processes), which can mean more page faults, which means the system runs
|
more slowly. In fact, if you lock enough memory, some programs may not
|
be able to run at all for lack of real memory.
|
|
|
File: libc.info, Node: Locked Memory Details, Next: Page Lock Functions, Prev: Why Lock Pages, Up: Locking Pages
|
|
3.4.2 Locked Memory Details
|
---------------------------
|
|
A memory lock is associated with a virtual page, not a real frame. The
|
paging rule is: If a frame backs at least one locked page, don’t page it
|
out.
|
|
Memory locks do not stack. I.e., you can’t lock a particular page
|
twice so that it has to be unlocked twice before it is truly unlocked.
|
It is either locked or it isn’t.
|
|
A memory lock persists until the process that owns the memory
|
explicitly unlocks it. (But process termination and exec cause the
|
virtual memory to cease to exist, which you might say means it isn’t
|
locked any more).
|
|
Memory locks are not inherited by child processes. (But note that on
|
a modern Unix system, immediately after a fork, the parent’s and the
|
child’s virtual address space are backed by the same real page frames,
|
so the child enjoys the parent’s locks). *Note Creating a Process::.
|
|
Because locking pages can impact other processes, only the superuser
|
can lock a page. Any process can unlock its own page.
|
|
The system sets limits on the amount of memory a process can have
|
locked and the amount of real memory it can have dedicated to it. *Note
|
Limits on Resources::.
|
|
In Linux, locked pages aren’t as locked as you might think. Two
|
virtual pages that are not shared memory can nonetheless be backed by
|
the same real frame. The kernel does this in the name of efficiency
|
when it knows both virtual pages contain identical data, and does it
|
even if one or both of the virtual pages are locked.
|
|
But when a process modifies one of those pages, the kernel must get
|
it a separate frame and fill it with the page’s data. This is known as
|
a "copy-on-write page fault". It takes a small amount of time and in a
|
pathological case, getting that frame may require I/O.
|
|
To make sure this doesn’t happen to your program, don’t just lock the
|
pages. Write to them as well, unless you know you won’t write to them
|
ever. And to make sure you have pre-allocated frames for your stack,
|
enter a scope that declares a C automatic variable larger than the
|
maximum stack size you will need, set it to something, then return from
|
its scope.
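The stack advice above can be sketched as a small function.  The size ‘MAX_STACK_NEEDED’ is a hypothetical figure that you would replace with your program’s own bound, and the function name is likewise an assumption.

```c
#include <stddef.h>

#define MAX_STACK_NEEDED (64 * 1024)    /* hypothetical bound */

/* Touch MAX_STACK_NEEDED bytes of stack so that frames are assigned
   to those pages, then return.  Returns the number of bytes touched.  */
size_t
prefault_stack (void)
{
  /* volatile keeps the compiler from discarding the stores.  */
  volatile unsigned char filler[MAX_STACK_NEEDED];
  size_t i;

  for (i = 0; i < sizeof filler; i++)
    filler[i] = 0;
  return sizeof filler;
}
```

A program would typically call such a function once, after locking its memory and before entering its time-critical part.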
|
|
|
File: libc.info, Node: Page Lock Functions, Prev: Locked Memory Details, Up: Locking Pages
|
|
3.4.3 Functions To Lock And Unlock Pages
|
----------------------------------------
|
|
The symbols in this section are declared in ‘sys/mman.h’. These
|
functions are defined by POSIX.1b, but their availability depends on
|
your kernel. If your kernel doesn’t allow these functions, they exist
|
but always fail. They _are_ available with a Linux kernel.
|
|
*Portability Note:* POSIX.1b requires that when the ‘mlock’ and
|
‘munlock’ functions are available, the file ‘unistd.h’ define the macro
|
‘_POSIX_MEMLOCK_RANGE’ and the file ‘limits.h’ define the macro
|
‘PAGESIZE’ to be the size of a memory page in bytes. It requires that
|
when the ‘mlockall’ and ‘munlockall’ functions are available, the
|
‘unistd.h’ file define the macro ‘_POSIX_MEMLOCK’. The GNU C Library
|
conforms to this requirement.
|
|
-- Function: int mlock (const void *ADDR, size_t LEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘mlock’ locks a range of the calling process’ virtual pages.
|
|
The range of memory starts at address ADDR and is LEN bytes long.
|
Actually, since you must lock whole pages, it is the range of pages
|
that include any part of the specified range.
|
|
When the function returns successfully, each of those pages is
|
backed by (connected to) a real frame (is resident) and is marked
|
to stay that way. This means the function may cause page-ins and
|
have to wait for them.
|
|
When the function fails, it does not affect the lock status of any
|
pages.
|
|
The return value is zero if the function succeeds. Otherwise, it
|
is ‘-1’ and ‘errno’ is set accordingly. ‘errno’ values specific to
|
this function are:
|
|
‘ENOMEM’
|
• At least some of the specified address range does not
|
exist in the calling process’ virtual address space.
|
• The locking would cause the process to exceed its locked
|
page limit.
|
|
‘EPERM’
|
The calling process is not superuser.
|
|
‘EINVAL’
|
LEN is not positive.
|
|
‘ENOSYS’
|
The kernel does not provide ‘mlock’ capability.
|
|
You can lock _all_ of a process’ memory with ‘mlockall’. You unlock
|
memory with ‘munlock’ or ‘munlockall’.
|
|
To avoid all page faults in a C program, you have to use
|
‘mlockall’, because some of the memory a program uses is hidden
|
from the C code, e.g. the stack and automatic variables, and you
|
wouldn’t know what address to tell ‘mlock’.
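The sketch below illustrates two points from the description of ‘mlock’: how many whole pages a byte range touches, and how a caller might report a failed lock.  Both function names are hypothetical, and the page size is passed in rather than queried.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Number of pages that include any part of the LEN bytes at ADDR.  */
size_t
pages_spanned (uintptr_t addr, size_t len, size_t page_size)
{
  uintptr_t first, last;

  if (len == 0)
    return 0;
  first = addr / page_size;
  last = (addr + len - 1) / page_size;
  return (size_t) (last - first + 1);
}

/* Try to lock LEN bytes at BUF.  Returns 0 on success, otherwise the
   errno value set by mlock (for example EPERM or ENOMEM).  */
int
lock_buffer (const void *buf, size_t len)
{
  int err;

  if (mlock (buf, len) == 0)
    return 0;
  err = errno;                  /* save before calling stdio */
  fprintf (stderr, "mlock: %s\n", strerror (err));
  return err;
}
```

For instance, a 2-byte range starting at address 4095 touches two 4096-byte pages, so locking it locks both.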
|
|
-- Function: int munlock (const void *ADDR, size_t LEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘munlock’ unlocks a range of the calling process’ virtual pages.
|
|
‘munlock’ is the inverse of ‘mlock’ and functions completely
|
analogously to ‘mlock’, except that there is no ‘EPERM’ failure.
|
|
-- Function: int mlockall (int FLAGS)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘mlockall’ locks all the pages in a process’ virtual memory address
|
space, and/or any that are added to it in the future. This
|
includes the pages of the code, data and stack segment, as well as
|
shared libraries, user space kernel data, shared memory, and memory
|
mapped files.
|
|
FLAGS is a bitwise OR of single-bit flags represented by the following
|
macros. They tell ‘mlockall’ which of its functions you want. All
|
other bits must be zero.
|
|
‘MCL_CURRENT’
|
Lock all pages which currently exist in the calling process’
|
virtual address space.
|
|
‘MCL_FUTURE’
|
Set a mode such that any pages added to the process’ virtual
|
address space in the future will be locked from birth. This
|
mode does not affect future address spaces owned by the same
|
process so exec, which replaces a process’ address space,
|
wipes out ‘MCL_FUTURE’. *Note Executing a File::.
|
|
When the function returns successfully, and you specified
|
‘MCL_CURRENT’, all of the process’ pages are backed by (connected
|
to) real frames (they are resident) and are marked to stay that
|
way. This means the function may cause page-ins and have to wait
|
for them.
|
|
When the process is in ‘MCL_FUTURE’ mode because it successfully
|
executed this function and specified ‘MCL_FUTURE’, any system call
|
by the process that requires space be added to its virtual address
|
space fails with ‘errno’ = ‘ENOMEM’ if locking the additional space
|
would cause the process to exceed its locked page limit. In the
|
case that the address space addition that can’t be accommodated is
|
stack expansion, the stack expansion fails and the kernel sends a
|
‘SIGSEGV’ signal to the process.
|
|
When the function fails, it does not affect the lock status of any
|
pages or the future locking mode.
|
|
The return value is zero if the function succeeds. Otherwise, it
|
is ‘-1’ and ‘errno’ is set accordingly. ‘errno’ values specific to
|
this function are:
|
|
‘ENOMEM’
|
• At least some of the specified address range does not
|
exist in the calling process’ virtual address space.
|
• The locking would cause the process to exceed its locked
|
page limit.
|
|
‘EPERM’
|
The calling process is not superuser.
|
|
‘EINVAL’
|
Undefined bits in FLAGS are not zero.
|
|
‘ENOSYS’
|
The kernel does not provide ‘mlockall’ capability.
|
|
You can lock just specific pages with ‘mlock’. You unlock pages
|
with ‘munlockall’ and ‘munlock’.
|
|
-- Function: int munlockall (void)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘munlockall’ unlocks every page in the calling process’ virtual
|
address space and turns off ‘MCL_FUTURE’ future locking mode.
|
|
The return value is zero if the function succeeds. Otherwise, it
|
is ‘-1’ and ‘errno’ is set accordingly. The only way this function
|
can fail is for generic reasons that all functions and system calls
|
can fail, so there are no specific ‘errno’ values.
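A typical calling pattern for ‘mlockall’ and ‘munlockall’ might look like the sketch below.  The function name is hypothetical, and the time-critical work is left as a comment.  The call can fail with ‘EPERM’ or ‘ENOMEM’, so the caller must be prepared for that.

```c
#include <errno.h>
#include <sys/mman.h>

/* Lock all current and future pages, do some work, then unlock.
   Returns 0 on success, otherwise the errno value of the call that
   failed.  */
int
lock_everything_briefly (void)
{
  if (mlockall (MCL_CURRENT | MCL_FUTURE) != 0)
    return errno;               /* nothing was locked */

  /* ... time-critical work would go here ...  */

  if (munlockall () != 0)
    return errno;
  return 0;
}
```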
|
|
|
File: libc.info, Node: Character Handling, Next: String and Array Utilities, Prev: Memory, Up: Top
|
|
4 Character Handling
|
********************
|
|
Programs that work with characters and strings often need to classify a
|
character—is it alphabetic, is it a digit, is it whitespace, and so
|
on—and perform case conversion operations on characters. The functions
|
in the header file ‘ctype.h’ are provided for this purpose.
|
|
Since the choice of locale and character set can alter the
|
classifications of particular character codes, all of these functions
|
are affected by the current locale. (More precisely, they are affected
|
by the locale currently selected for character classification—the
|
‘LC_CTYPE’ category; see *note Locale Categories::.)
|
|
The ISO C standard specifies two different sets of functions. The
|
one set works on ‘char’ type characters, the other one on ‘wchar_t’ wide
|
characters (*note Extended Char Intro::).
|
|
* Menu:
|
|
* Classification of Characters:: Testing whether characters are
|
letters, digits, punctuation, etc.
|
|
* Case Conversion:: Case mapping, and the like.
|
* Classification of Wide Characters:: Character class determination for
|
wide characters.
|
* Using Wide Char Classes:: Notes on using the wide character
|
classes.
|
* Wide Character Case Conversion:: Mapping of wide characters.
|
|
|
File: libc.info, Node: Classification of Characters, Next: Case Conversion, Up: Character Handling
|
|
4.1 Classification of Characters
|
================================
|
|
This section explains the library functions for classifying characters.
|
For example, ‘isalpha’ is the function to test for an alphabetic
|
character. It takes one argument, the character to test, and returns a
|
nonzero integer if the character is alphabetic, and zero otherwise. You
|
would use it like this:
|
|
if (isalpha (c))
|
printf ("The character `%c' is alphabetic.\n", c);
|
|
Each of the functions in this section tests for membership in a
|
particular class of characters; each has a name starting with ‘is’.
|
Each of them takes one argument, which is a character to test, and
|
returns an ‘int’ which is treated as a boolean value. The character
|
argument is passed as an ‘int’, and it may be the constant value ‘EOF’
|
instead of a real character.
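Because the argument must be either ‘EOF’ or a value representable as an ‘unsigned char’, a plain ‘char’ taken from a string should be converted before it is passed.  The sketch below shows this for ‘isalpha’; the function name is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters in S under the current locale.  */
size_t
count_alpha (const char *s)
{
  size_t count = 0;

  for (; *s != '\0'; s++)
    {
      /* The cast keeps characters with the high bit set from being
         passed as negative values.  */
      if (isalpha ((unsigned char) *s))
        count++;
    }
  return count;
}
```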
|
|
The attributes of any given character can vary between locales.
|
*Note Locales::, for more information on locales.
|
|
These functions are declared in the header file ‘ctype.h’.
|
|
-- Function: int islower (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a lower-case letter. The letter need not be
|
from the Latin alphabet; any representable alphabet is valid.
|
|
-- Function: int isupper (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is an upper-case letter. The letter need not be
|
from the Latin alphabet; any representable alphabet is valid.
|
|
-- Function: int isalpha (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is an alphabetic character (a letter). If
|
‘islower’ or ‘isupper’ is true of a character, then ‘isalpha’ is
|
also true.
|
|
In some locales, there may be additional characters for which
|
‘isalpha’ is true—letters which are neither upper case nor lower
|
case. But in the standard ‘"C"’ locale, there are no such
|
additional characters.
|
|
-- Function: int isdigit (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a decimal digit (‘0’ through ‘9’).
|
|
-- Function: int isalnum (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is an alphanumeric character (a letter or
|
digit); in other words, if either ‘isalpha’ or ‘isdigit’ is true
|
of a character, then ‘isalnum’ is also true.
|
|
-- Function: int isxdigit (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a hexadecimal digit. Hexadecimal digits
|
include the normal decimal digits ‘0’ through ‘9’ and the letters
|
‘A’ through ‘F’ and ‘a’ through ‘f’.
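As a sketch of how ‘isxdigit’ might be used, the hypothetical function below converts a string of hexadecimal digits to a number.  It assumes the letters ‘a’ through ‘f’ have consecutive codes, as in ASCII, and it does not check for overflow.

```c
#include <ctype.h>

/* Return the value of the hexadecimal string S, or -1 if S is empty
   or contains a character that is not a hexadecimal digit.  */
long
parse_hex (const char *s)
{
  long value = 0;

  if (*s == '\0')
    return -1;
  for (; *s != '\0'; s++)
    {
      unsigned char c = (unsigned char) *s;
      int digit;

      if (!isxdigit (c))
        return -1;
      digit = isdigit (c) ? c - '0' : tolower (c) - 'a' + 10;
      value = value * 16 + digit;
    }
  return value;
}
```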
|
|
-- Function: int ispunct (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a punctuation character. This means any
|
printing character that is not alphanumeric or a space character.
|
|
-- Function: int isspace (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a "whitespace" character. In the standard
|
‘"C"’ locale, ‘isspace’ returns true for only the standard
|
whitespace characters:
|
|
‘' '’
|
space
|
|
‘'\f'’
|
formfeed
|
|
‘'\n'’
|
newline
|
|
‘'\r'’
|
carriage return
|
|
‘'\t'’
|
horizontal tab
|
|
‘'\v'’
|
vertical tab
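A frequent use of ‘isspace’ is skipping whitespace at the start of a string, sketched here with a hypothetical function name.

```c
#include <ctype.h>

/* Return a pointer to the first character of S that is not
   whitespace; this is the terminating null if S is all whitespace.  */
const char *
skip_leading_space (const char *s)
{
  while (isspace ((unsigned char) *s))
    s++;
  return s;
}
```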
|
|
-- Function: int isblank (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a blank character; that is, a space or a tab.
|
This function was originally a GNU extension, but was added in
|
ISO C99.
|
|
-- Function: int isgraph (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a graphic character; that is, a character that
|
has a glyph associated with it. The whitespace characters are not
|
considered graphic.
|
|
-- Function: int isprint (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a printing character. Printing characters
|
include all the graphic characters, plus the space (‘ ’) character.
|
|
-- Function: int iscntrl (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a control character (that is, a character that
|
is not a printing character).
|
|
-- Function: int isascii (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns true if C is a 7-bit ‘unsigned char’ value that fits into
|
the US/UK ASCII character set. This function is a BSD extension
|
and is also an SVID extension.
|
|
|
File: libc.info, Node: Case Conversion, Next: Classification of Wide Characters, Prev: Classification of Characters, Up: Character Handling
|
|
4.2 Case Conversion
|
===================
|
|
This section explains the library functions for performing conversions
|
such as case mappings on characters. For example, ‘toupper’ converts
|
any character to upper case if possible. If the character can’t be
|
converted, ‘toupper’ returns it unchanged.
|
|
These functions take one argument of type ‘int’, which is the
|
character to convert, and return the converted character as an ‘int’.
|
If the conversion is not applicable to the argument given, the argument
|
is returned unchanged.
|
|
*Compatibility Note:* In pre-ISO C dialects, instead of returning the
|
argument unchanged, these functions may fail when the argument is not
|
suitable for the conversion. Thus for portability, you may need to
|
write ‘islower(c) ? toupper(c) : c’ rather than just ‘toupper(c)’.
|
|
These functions are declared in the header file ‘ctype.h’.
|
|
-- Function: int tolower (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
If C is an upper-case letter, ‘tolower’ returns the corresponding
|
lower-case letter. If C is not an upper-case letter, C is returned
|
unchanged.
|
|
-- Function: int toupper (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
If C is a lower-case letter, ‘toupper’ returns the corresponding
|
upper-case letter. Otherwise C is returned unchanged.
|
|
-- Function: int toascii (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function converts C to a 7-bit ‘unsigned char’ value that fits
|
into the US/UK ASCII character set, by clearing the high-order
|
bits. This function is a BSD extension and is also an SVID
|
extension.
|
|
-- Function: int _tolower (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is identical to ‘tolower’, and is provided for compatibility
|
with the SVID. *Note SVID::.
|
|
-- Function: int _toupper (int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is identical to ‘toupper’, and is provided for compatibility
|
with the SVID.
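As an illustration of the functions in this section, the following sketch converts a whole string to upper case. The name `upcase_string` is hypothetical and not part of the library. The cast to `unsigned char` is there because the `ctype.h` functions are only defined for `EOF` and for values representable as `unsigned char`.

```c
#include <ctype.h>
#include <stddef.h>

/* Convert the string S to upper case in place and return S.
   Each byte is converted to unsigned char before it is passed to
   toupper, since a negative argument other than EOF is undefined.  */
char *
upcase_string (char *s)
{
  for (size_t i = 0; s[i] != '\0'; i++)
    s[i] = (char) toupper ((unsigned char) s[i]);
  return s;
}
```

Bytes that have no upper-case counterpart, such as digits and punctuation, are left unchanged because `toupper` returns them as they are.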
|
|
|
File: libc.info, Node: Classification of Wide Characters, Next: Using Wide Char Classes, Prev: Case Conversion, Up: Character Handling
|
|
4.3 Character class determination for wide characters
|
=====================================================
|
|
Amendment 1 to ISO C90 defines functions to classify wide characters.
|
Although the original ISO C90 standard already defined the type
|
‘wchar_t’, no functions operating on them were defined.
|
|
The design of the classification functions for wide characters is
|
more general than that for ‘char’. It allows extensions to the set of
|
available classifications, beyond those which are always available. The
|
POSIX standard specifies how extensions can be made, and this is already
|
implemented in the GNU C Library implementation of the ‘localedef’
|
program.
|
|
The character class functions are normally implemented with bitsets,
|
with a bitset per character. For a given character, the appropriate
|
bitset is read from a table and a test is performed as to whether a
|
certain bit is set. Which bit is tested for is determined by the class.
|
|
For the wide character classification functions this is made visible.
|
There is a classification type defined, a function to retrieve this
|
value for a given class, and a function to test whether a given
|
character is in this class, using the classification value. On top of
|
this the normal character classification functions as used for ‘char’
|
objects can be defined.
|
|
-- Data type: wctype_t
|
The ‘wctype_t’ type can hold a value which represents a character class.
|
The only defined way to generate such a value is by using the
|
‘wctype’ function.
|
|
This type is defined in ‘wctype.h’.
|
|
-- Function: wctype_t wctype (const char *PROPERTY)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
‘wctype’ returns a value representing a class of wide characters
|
which is identified by the string PROPERTY. In addition to the standard
|
properties, each locale can define its own. If no
|
property with the given name is known for the current locale
|
selected for the ‘LC_CTYPE’ category, the function returns zero.
|
|
The properties known in every locale are:
|
|
‘"alnum"’ ‘"alpha"’ ‘"cntrl"’ ‘"digit"’
|
‘"graph"’ ‘"lower"’ ‘"print"’ ‘"punct"’
|
‘"space"’ ‘"upper"’ ‘"xdigit"’
|
|
This function is declared in ‘wctype.h’.
|
|
To test whether a character belongs to one of the non-standard
|
classes the ISO C standard defines a completely new function.
|
|
-- Function: int iswctype (wint_t WC, wctype_t DESC)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function returns a nonzero value if WC is in the character
|
class specified by DESC. DESC must have been returned earlier by a
|
successful call to ‘wctype’.
|
|
This function is declared in ‘wctype.h’.
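A minimal sketch of how ‘wctype’ and ‘iswctype’ fit together is shown here. The helper name `in_named_class` is an assumption made for this example. It treats an unknown class name as "not a member" instead of passing a zero descriptor on.

```c
#include <wchar.h>
#include <wctype.h>

/* Return nonzero if WC belongs to the class named NAME in the current
   LC_CTYPE locale.  Return zero if it does not, or if the locale has
   no class with that name.  */
int
in_named_class (wint_t wc, const char *name)
{
  wctype_t desc = wctype (name);
  if (desc == (wctype_t) 0)
    return 0;                   /* No such class in this locale.  */
  return iswctype (wc, desc);
}
```

A call such as `in_named_class (L'a', "alpha")` yields a nonzero value in every locale, because ‘"alpha"’ is one of the properties that are always known.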
|
|
To make it easier to use the commonly-used classification functions,
|
they are defined in the C library. There is no need to use ‘wctype’ if
|
the property string is one of the known character classes. In some
|
situations it is desirable to construct the property strings, and then
|
it is important that ‘wctype’ can also handle the standard classes.
|
|
-- Function: int iswalnum (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function returns a nonzero value if WC is an alphanumeric
|
character (a letter or number); in other words, if either
|
‘iswalpha’ or ‘iswdigit’ is true of a character, then ‘iswalnum’ is
|
also true.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("alnum"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswalpha (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is an alphabetic character (a letter). If
|
‘iswlower’ or ‘iswupper’ is true of a character, then ‘iswalpha’ is
|
also true.
|
|
In some locales, there may be additional characters for which
|
‘iswalpha’ is true—letters which are neither upper case nor lower
|
case. But in the standard ‘"C"’ locale, there are no such
|
additional characters.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("alpha"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswcntrl (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a control character (that is, a character
|
that is not a printing character).
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("cntrl"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswdigit (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a digit (e.g., ‘0’ through ‘9’). Please note
|
that this function does not only return a nonzero value for
|
_decimal_ digits, but for all kinds of digits. A consequence is
|
that code like the following will *not* work unconditionally for
|
wide characters:
|
|
n = 0;
|
while (iswdigit (*wc))
|
{
|
n *= 10;
|
n += *wc++ - L'0';
|
}
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("digit"))
|
|
It is declared in ‘wctype.h’.
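One way to avoid the problem described for the loop above is to test for the decimal digits explicitly. The following is only a sketch: `parse_decimal` is a hypothetical name, overflow is not checked, and the code assumes that ‘L'0'’ through ‘L'9'’ have consecutive values, as they do in the GNU C Library.

```c
#include <wchar.h>

/* Parse the run of decimal digits L'0'..L'9' at the start of WS and
   return its value.  Other kinds of digits end the run.  Overflow is
   not checked in this sketch.  */
unsigned long
parse_decimal (const wchar_t *ws)
{
  unsigned long n = 0;
  while (*ws >= L'0' && *ws <= L'9')
    {
      n *= 10;
      n += (unsigned long) (*ws++ - L'0');
    }
  return n;
}
```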
|
|
-- Function: int iswgraph (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a graphic character; that is, a character
|
that has a glyph associated with it. The whitespace characters are
|
not considered graphic.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("graph"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswlower (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a lower-case letter. The letter need not be
|
from the Latin alphabet; any representable alphabet is valid.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("lower"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswprint (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a printing character. Printing characters
|
include all the graphic characters, plus the space (‘ ’) character.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("print"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswpunct (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a punctuation character. This means any
|
printing character that is not alphanumeric or a space character.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("punct"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswspace (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a "whitespace" character. In the standard
|
‘"C"’ locale, ‘iswspace’ returns true for only the standard
|
whitespace characters:
|
|
‘L' '’
|
space
|
|
‘L'\f'’
|
formfeed
|
|
‘L'\n'’
|
newline
|
|
‘L'\r'’
|
carriage return
|
|
‘L'\t'’
|
horizontal tab
|
|
‘L'\v'’
|
vertical tab
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("space"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswupper (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is an upper-case letter. The letter need not be
|
from the Latin alphabet; any representable alphabet is valid.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("upper"))
|
|
It is declared in ‘wctype.h’.
|
|
-- Function: int iswxdigit (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a hexadecimal digit. Hexadecimal digits
|
include the normal decimal digits ‘0’ through ‘9’ and the letters
|
‘A’ through ‘F’ and ‘a’ through ‘f’.
|
|
This function can be implemented using
|
|
iswctype (wc, wctype ("xdigit"))
|
|
It is declared in ‘wctype.h’.
|
|
The GNU C Library also provides a function which was not part of
|
Amendment 1 to ISO C90 but which has a counterpart for single-byte
|
characters as well.
|
|
-- Function: int iswblank (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
Returns true if WC is a blank character; that is, a space or a tab.
|
This function was originally a GNU extension, but was added in
|
ISO C99. It is declared in ‘wctype.h’.
|
|
|
File: libc.info, Node: Using Wide Char Classes, Next: Wide Character Case Conversion, Prev: Classification of Wide Characters, Up: Character Handling
|
|
4.4 Notes on using the wide character classes
|
=============================================
|
|
The first note is probably not astonishing but still occasionally a
|
cause of problems. The ‘iswXXX’ functions can be implemented using
|
macros and in fact, the GNU C Library does this. They are still
|
available as real functions but when the ‘wctype.h’ header is included
|
the macros will be used. This is the same as the ‘char’ type versions
|
of these functions.
|
|
The second note covers something new. It can be best illustrated by
|
a (real-world) example. The first piece of code is an excerpt from the
|
original code. It is truncated a bit but the intention should be clear.
|
|
int
|
is_in_class (int c, const char *class)
|
{
|
if (strcmp (class, "alnum") == 0)
|
return isalnum (c);
|
if (strcmp (class, "alpha") == 0)
|
return isalpha (c);
|
if (strcmp (class, "cntrl") == 0)
|
return iscntrl (c);
|
…
|
return 0;
|
}
|
|
Now, with ‘wctype’ and ‘iswctype’ you can avoid the ‘if’
|
cascades, but rewriting the code as follows is wrong:
|
|
int
|
is_in_class (int c, const char *class)
|
{
|
wctype_t desc = wctype (class);
|
return desc ? iswctype ((wint_t) c, desc) : 0;
|
}
|
|
The problem is that it is not guaranteed that the wide character
|
representation of a single-byte character can be found using casting.
|
In fact, usually this fails miserably. The correct solution to this
|
problem is to write the code as follows:
|
|
int
|
is_in_class (int c, const char *class)
|
{
|
wctype_t desc = wctype (class);
|
return desc ? iswctype (btowc (c), desc) : 0;
|
}
|
|
*Note Converting a Character::, for more information on ‘btowc’.
|
Note that this change probably does not improve the performance of the
|
program a lot since the ‘wctype’ function still has to make the string
|
comparisons. It gets really interesting if the ‘is_in_class’ function
|
is called more than once for the same class name. In this case the
|
variable DESC could be computed once and reused for all the calls.
|
Therefore the above form of the function is probably not the final one.
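The idea of computing the descriptor once can be sketched as follows. The function name `count_in_class` is hypothetical, and the code assumes a single-byte encoding, so that every byte of the string is one character.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Count the bytes of the string S that belong to the class named
   CLASS_NAME.  The descriptor is looked up once, not once per byte.
   This assumes a single-byte encoding.  */
size_t
count_in_class (const char *s, const char *class_name)
{
  wctype_t desc = wctype (class_name);
  size_t count = 0;

  if (desc == (wctype_t) 0)
    return 0;                   /* Unknown class: nothing matches.  */
  for (; *s != '\0'; s++)
    {
      /* btowc expects EOF or an unsigned char value.  */
      wint_t wc = btowc ((unsigned char) *s);
      if (wc != WEOF && iswctype (wc, desc))
        count++;
    }
  return count;
}
```

With this arrangement the string comparisons inside ‘wctype’ happen once per call instead of once per character.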
|
|
|
File: libc.info, Node: Wide Character Case Conversion, Prev: Using Wide Char Classes, Up: Character Handling
|
|
4.5 Mapping of wide characters.
|
===============================
|
|
The mapping functions are also generalized by the ISO C standard.
|
Instead of just allowing the two standard mappings, a locale can contain
|
others. Again, the ‘localedef’ program already supports generating such
|
locale data files.
|
|
-- Data Type: wctrans_t
|
This data type is defined as a scalar type which can hold a value
|
representing the locale-dependent character mapping. There is no
|
way to construct such a value apart from using the return value of
|
the ‘wctrans’ function.
|
|
This type is defined in ‘wctype.h’.
|
|
-- Function: wctrans_t wctrans (const char *PROPERTY)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
The ‘wctrans’ function has to be used to find out whether a named
|
mapping is defined in the current locale selected for the
|
‘LC_CTYPE’ category. If the returned value is non-zero, you can
|
use it afterwards in calls to ‘towctrans’. If the return value is
|
zero no such mapping is known in the current locale.
|
|
Besides locale-specific mappings there are two mappings which are
|
guaranteed to be available in every locale:
|
|
‘"tolower"’ ‘"toupper"’
|
|
This function is declared in ‘wctype.h’.
|
|
-- Function: wint_t towctrans (wint_t WC, wctrans_t DESC)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘towctrans’ maps the input character WC according to the rules of
|
the mapping for which DESC is a descriptor, and returns the value
|
it finds. DESC must be obtained by a successful call to ‘wctrans’.
|
|
This function is declared in ‘wctype.h’.
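The following sketch shows ‘wctrans’ and ‘towctrans’ used together. The name `map_by_name` is an assumption made for this example. If the current locale does not know the mapping, the character is returned unchanged.

```c
#include <wchar.h>
#include <wctype.h>

/* Apply the mapping named NAME to WC.  If the current LC_CTYPE locale
   has no such mapping, return WC unchanged.  */
wint_t
map_by_name (wint_t wc, const char *name)
{
  wctrans_t desc = wctrans (name);
  if (desc == (wctrans_t) 0)
    return wc;                  /* No such mapping in this locale.  */
  return towctrans (wc, desc);
}
```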
|
|
For the generally available mappings, the ISO C standard defines
|
convenient shortcuts so that it is not necessary to call ‘wctrans’ for
|
them.
|
|
-- Function: wint_t towlower (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
If WC is an upper-case letter, ‘towlower’ returns the corresponding
|
lower-case letter. If WC is not an upper-case letter, WC is
|
returned unchanged.
|
|
‘towlower’ can be implemented using
|
|
towctrans (wc, wctrans ("tolower"))
|
|
This function is declared in ‘wctype.h’.
|
|
-- Function: wint_t towupper (wint_t WC)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
If WC is a lower-case letter, ‘towupper’ returns the corresponding
|
upper-case letter. Otherwise WC is returned unchanged.
|
|
‘towupper’ can be implemented using
|
|
towctrans (wc, wctrans ("toupper"))
|
|
This function is declared in ‘wctype.h’.
|
|
The same warnings given in the last section for the use of the wide
|
character classification functions apply here. It is not possible to
|
simply cast a ‘char’ type value to a ‘wint_t’ and use it as an argument
|
to ‘towctrans’ calls.
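A hedged sketch of the conversion that is needed is given here. The helper `upcase_byte` is hypothetical. It converts the byte with ‘btowc’, maps the wide character, and converts the result back with ‘wctob’, falling back to the original value whenever a step has no single-byte result.

```c
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Return the upper-case counterpart of the single-byte character C,
   which must be EOF or an unsigned char value.  C is returned
   unchanged if it has no wide character representation or if the
   result does not fit into a single byte.  */
int
upcase_byte (int c)
{
  wint_t wc = btowc (c);
  if (wc == WEOF)
    return c;
  int b = wctob (towupper (wc));
  return b == EOF ? c : b;
}
```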
|
|
|
File: libc.info, Node: String and Array Utilities, Next: Character Set Handling, Prev: Character Handling, Up: Top
|
|
5 String and Array Utilities
|
****************************
|
|
Operations on strings (null-terminated byte sequences) are an important
|
part of many programs. The GNU C Library provides an extensive set of
|
string utility functions, including functions for copying,
|
concatenating, comparing, and searching strings. Many of these
|
functions can also operate on arbitrary regions of storage; for example,
|
the ‘memcpy’ function can be used to copy the contents of any kind of
|
array.
|
|
It’s fairly common for beginning C programmers to “reinvent the
|
wheel” by duplicating this functionality in their own code, but it pays
|
to become familiar with the library functions and to make use of them,
|
since this offers benefits in maintenance, efficiency, and portability.
|
|
For instance, you could easily compare one string to another in two
|
lines of C code, but if you use the built-in ‘strcmp’ function, you’re
|
less likely to make a mistake. And, since these library functions are
|
typically highly optimized, your program may run faster too.
|
|
* Menu:
|
|
* Representation of Strings:: Introduction to basic concepts.
|
* String/Array Conventions:: Whether to use a string function or an
|
arbitrary array function.
|
* String Length:: Determining the length of a string.
|
* Copying Strings and Arrays:: Functions to copy strings and arrays.
|
* Concatenating Strings:: Functions to concatenate strings while copying.
|
* Truncating Strings:: Functions to truncate strings while copying.
|
* String/Array Comparison:: Functions for byte-wise and character-wise
|
comparison.
|
* Collation Functions:: Functions for collating strings.
|
* Search Functions:: Searching for a specific element or substring.
|
* Finding Tokens in a String:: Splitting a string into tokens by looking
|
for delimiters.
|
* Erasing Sensitive Data:: Clearing memory which contains sensitive
|
data, after it’s no longer needed.
|
* strfry:: Function for flash-cooking a string.
|
* Trivial Encryption:: Obscuring data.
|
* Encode Binary Data:: Encoding and Decoding of Binary Data.
|
* Argz and Envz Vectors:: Null-separated string vectors.
|
|
|
File: libc.info, Node: Representation of Strings, Next: String/Array Conventions, Up: String and Array Utilities
|
|
5.1 Representation of Strings
|
=============================
|
|
This section is a quick summary of string concepts for beginning C
|
programmers. It describes how strings are represented in C and some
|
common pitfalls. If you are already familiar with this material, you
|
can skip this section.
|
|
A "string" is a null-terminated array of bytes of type ‘char’,
|
including the terminating null byte. String-valued variables are
|
usually declared to be pointers of type ‘char *’. Such variables do not
|
include space for the text of a string; that has to be stored somewhere
|
else—in an array variable, a string constant, or dynamically allocated
|
memory (*note Memory Allocation::). It’s up to you to store the address
|
of the chosen memory space into the pointer variable. Alternatively you
|
can store a "null pointer" in the pointer variable. The null pointer
|
does not point anywhere, so attempting to reference the string it points
|
to gets an error.
|
|
A "multibyte character" is a sequence of one or more bytes that
|
represents a single character using the locale’s encoding scheme; a null
|
byte always represents the null character. A "multibyte string" is a
|
string that consists entirely of multibyte characters. In contrast, a
|
"wide string" is a null-terminated sequence of ‘wchar_t’ objects. A
|
wide-string variable is usually declared to be a pointer of type
|
‘wchar_t *’, by analogy with string variables and ‘char *’. *Note
|
Extended Char Intro::.
|
|
By convention, the "null byte", ‘'\0'’, marks the end of a string and
|
the "null wide character", ‘L'\0'’, marks the end of a wide string. For
|
example, in testing to see whether the ‘char *’ variable P points to a
|
null byte marking the end of a string, you can write ‘!*P’ or ‘*P ==
|
'\0'’.
|
|
A null byte is quite different conceptually from a null pointer,
|
although both are represented by the integer constant ‘0’.
|
|
A "string literal" appears in C program source as a multibyte string
|
between double-quote characters (‘"’). If the initial double-quote
|
character is immediately preceded by a capital ‘L’ (ell) character (as
|
in ‘L"foo"’), it is a wide string literal. String literals can also
|
contribute to "string concatenation": ‘"a" "b"’ is the same as ‘"ab"’.
|
For wide strings one can use either ‘L"a" L"b"’ or ‘L"a" "b"’.
|
Modification of string literals is not allowed by the GNU C compiler,
|
because literals are placed in read-only storage.
|
|
Arrays that are declared ‘const’ cannot be modified either. It’s
|
generally good style to declare non-modifiable string pointers to be of
|
type ‘const char *’, since this often allows the C compiler to detect
|
accidental modifications as well as providing some amount of
|
documentation about what your program intends to do with the string.
|
|
The amount of memory allocated for a byte array may extend past the
|
null byte that marks the end of the string that the array contains. In
|
this document, the term "allocated size" is always used to refer to the
|
total amount of memory allocated for an array, while the term "length"
|
refers to the number of bytes up to (but not including) the terminating
|
null byte. Wide strings are similar, except their sizes and lengths
|
count wide characters, not bytes.
|
|
A notorious source of program bugs is trying to put more bytes into a
|
string than fit in its allocated size. When writing code that extends
|
strings or moves bytes into a pre-allocated array, you should be very
|
careful to keep track of the length of the text and make explicit checks
|
for overflowing the array. Many of the library functions _do not_ do
|
this for you! Remember also that you need to allocate an extra byte to
|
hold the null byte that marks the end of the string.
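The explicit check described above might look like the following sketch. The name `copy_checked` and its return convention are assumptions made for this example, not library interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Copy the string SRC into the array DST, whose allocated size is
   DSTSIZE bytes.  Return 0 on success.  Return -1 and leave DST
   untouched if SRC plus its terminating null byte does not fit.  */
int
copy_checked (char *dst, size_t dstsize, const char *src)
{
  size_t len = strlen (src);
  if (len >= dstsize)           /* No room for the null byte.  */
    return -1;
  memcpy (dst, src, len + 1);   /* Copy the text and the null byte.  */
  return 0;
}
```

A string of length 5 therefore needs an array with an allocated size of at least 6.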
|
|
Originally strings were sequences of bytes where each byte
|
represented a single character. This is still true today if the strings
|
are encoded using a single-byte character encoding. Things are
|
different if the strings are encoded using a multibyte encoding (for
|
more information on encodings see *note Extended Char Intro::). There
|
is no difference in the programming interface for these two kinds of
|
strings; the programmer has to be aware of this and interpret the byte
|
sequences accordingly.
|
|
But since there is no separate interface taking care of these
|
differences the byte-based string functions are sometimes hard to use.
|
Since the count parameters of these functions specify bytes a call to
|
‘memcpy’ could cut a multibyte character in the middle and put an
|
incomplete (and therefore unusable) byte sequence in the target buffer.
|
|
To avoid these problems later versions of the ISO C standard
|
introduce a second set of functions which operate on "wide
|
characters" (*note Extended Char Intro::). These functions don’t have
|
the problems the single-byte versions have since every wide character is
|
a legal, interpretable value. This does not mean that cutting wide
|
strings at arbitrary points is without problems. It normally is for
|
alphabet-based languages (except for non-normalized text) but languages
|
based on syllables still have the problem that more than one wide
|
character is necessary to complete a logical unit. This is a higher
|
level problem which the C library functions are not designed to solve.
|
But it is at least good that no invalid byte sequences can be created.
|
Also, higher-level functions can operate much more easily on
|
wide characters than on multibyte characters so that a common strategy
|
is to use wide characters internally whenever text is more than simply
|
copied.
|
|
The remainder of this chapter will discuss the functions for handling
|
wide strings in parallel with the discussion of strings since there is
|
almost always an exact equivalent available.
|
|
|
File: libc.info, Node: String/Array Conventions, Next: String Length, Prev: Representation of Strings, Up: String and Array Utilities
|
|
5.2 String and Array Conventions
|
================================
|
|
This chapter describes both functions that work on arbitrary arrays or
|
blocks of memory, and functions that are specific to strings and wide
|
strings.
|
|
Functions that operate on arbitrary blocks of memory have names
|
beginning with ‘mem’ and ‘wmem’ (such as ‘memcpy’ and ‘wmemcpy’) and
|
invariably take an argument which specifies the size (in bytes and wide
|
characters respectively) of the block of memory to operate on. The
|
array arguments and return values for these functions have type ‘void *’
|
or ‘wchar_t *’. As a matter of style, the elements of the arrays used
|
with the ‘mem’ functions are referred to as “bytes”. You can pass any
|
kind of pointer to these functions, and the ‘sizeof’ operator is useful
|
in computing the value for the size argument. Parameters to the ‘wmem’
|
functions must be of type ‘wchar_t *’. These functions are not really
|
usable with anything but arrays of this type.
|
|
In contrast, functions that operate specifically on strings and wide
|
strings have names beginning with ‘str’ and ‘wcs’ respectively (such as
|
‘strcpy’ and ‘wcscpy’) and look for a terminating null byte or null wide
|
character instead of requiring an explicit size argument to be passed.
|
(Some of these functions accept a specified maximum length, but they
|
also check for premature termination.) The array arguments and return
|
values for these functions have type ‘char *’ and ‘wchar_t *’
|
respectively, and the array elements are referred to as “bytes” and
|
“wide characters”.
|
|
In many cases, there are both ‘mem’ and ‘str’/‘wcs’ versions of a
|
function. The one that is more appropriate to use depends on the exact
|
situation. When your program is manipulating arbitrary arrays or blocks
|
of storage, then you should always use the ‘mem’ functions. On the
|
other hand, when you are manipulating strings it is usually more
|
convenient to use the ‘str’/‘wcs’ functions, unless you already know the
|
length of the string in advance. The ‘wmem’ functions should be used
|
for wide character arrays with known size.
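The difference between the two kinds of size argument can be illustrated with a short sketch. The type `struct point` and the function names are hypothetical. The point is only that ‘memcpy’ counts bytes while ‘wmemcpy’ counts wide characters.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

struct point { int x, y; };

/* Copy N points.  The size argument of memcpy counts bytes, so the
   element count is multiplied by the element size.  */
void
copy_points (struct point *to, const struct point *from, size_t n)
{
  memcpy (to, from, n * sizeof (struct point));
}

/* Copy N wide characters.  The size argument of wmemcpy counts wide
   characters, so no multiplication by sizeof (wchar_t) is wanted.  */
void
copy_wide (wchar_t *to, const wchar_t *from, size_t n)
{
  wmemcpy (to, from, n);
}
```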
|
|
Some of the memory and string functions take single characters as
|
arguments. Since a value of type ‘char’ is automatically promoted into
|
a value of type ‘int’ when used as a parameter, the functions are
|
declared with ‘int’ as the type of the parameter in question. In case
|
of the wide character functions the situation is similar: the parameter
|
type for a single wide character is ‘wint_t’ and not ‘wchar_t’. This
|
would for many implementations not be necessary since ‘wchar_t’ is large
|
enough to not be automatically promoted, but since the ISO C standard
|
does not require such a choice of types the ‘wint_t’ type is used.
|
|
|
File: libc.info, Node: String Length, Next: Copying Strings and Arrays, Prev: String/Array Conventions, Up: String and Array Utilities
|
|
5.3 String Length
|
=================
|
|
You can get the length of a string using the ‘strlen’ function. This
|
function is declared in the header file ‘string.h’.
|
|
-- Function: size_t strlen (const char *S)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strlen’ function returns the length of the string S in bytes.
|
(In other words, it returns the offset of the terminating null byte
|
within the array.)
|
|
For example,
|
strlen ("hello, world")
|
⇒ 12
|
|
When applied to an array, the ‘strlen’ function returns the length
|
of the string stored there, not its allocated size. You can get
|
the allocated size of the array that holds a string using the
|
‘sizeof’ operator:
|
|
char string[32] = "hello, world";
|
sizeof (string)
|
⇒ 32
|
strlen (string)
|
⇒ 12
|
|
But beware, this will not work unless STRING is the array itself,
|
not a pointer to it. For example:
|
|
char string[32] = "hello, world";
|
char *ptr = string;
|
sizeof (string)
|
⇒ 32
|
sizeof (ptr)
|
⇒ 4 /* (on a machine with 4 byte pointers) */
|
|
This is an easy mistake to make when you are working with functions
|
that take string arguments; those arguments are always pointers,
|
not arrays.
|
|
It must also be noted that for multibyte encoded strings the return
|
value does not have to correspond to the number of characters in
|
the string. To get this value the string can be converted to wide
|
characters and ‘wcslen’ can be used or something like the following
|
code can be used:
|
|
/* The input is in ‘string’.
|
The character count is stored in ‘n’. */
|
{
|
mbstate_t t;
|
const char *scopy = string;
|
/* In initial state. */
|
memset (&t, '\0', sizeof (t));
|
/* Determine number of characters. */
|
n = mbsrtowcs (NULL, &scopy, strlen (scopy), &t);
|
}
|
|
This is cumbersome to do, so if the number of characters (as opposed
|
to bytes) is needed often it is better to work with wide
|
characters.
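For reference, the fragment above can be wrapped into a complete function, sketched here under the hypothetical name `mb_char_count`. The result depends on the current ‘LC_CTYPE’ locale, so a program that handles encodings other than that of the ‘"C"’ locale would first have to select a locale with ‘setlocale’.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Return the number of characters in the multibyte string S as
   interpreted in the current LC_CTYPE locale, or (size_t) -1 if S
   contains an invalid multibyte sequence.  */
size_t
mb_char_count (const char *s)
{
  mbstate_t state;
  const char *p = s;

  /* Start in the initial conversion state.  */
  memset (&state, 0, sizeof state);
  /* With a null destination nothing is stored and the length limit
     is ignored.  The return value is the number of wide characters
     the string converts to, not counting the terminating null.  */
  return mbsrtowcs (NULL, &p, 0, &state);
}
```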
|
|
The wide character equivalent is declared in ‘wchar.h’.
|
|
-- Function: size_t wcslen (const wchar_t *WS)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcslen’ function is the wide character equivalent to ‘strlen’.
|
The return value is the number of wide characters in the wide
|
string pointed to by WS (this is also the offset of the terminating
|
null wide character of WS).
|
|
Since there are no multi wide character sequences making up one
|
wide character the return value is not only the offset in the
|
array, it is also the number of wide characters.
|
|
This function was introduced in Amendment 1 to ISO C90.
|
|
-- Function: size_t strnlen (const char *S, size_t MAXLEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
If the array S of size MAXLEN contains a null byte, the ‘strnlen’
|
function returns the length of the string S in bytes. Otherwise it
|
returns MAXLEN. Therefore this function is equivalent to ‘(strlen
|
(S) < MAXLEN ? strlen (S) : MAXLEN)’ but it is more efficient and
|
works even if S is not null-terminated so long as MAXLEN does not
|
exceed the size of S’s array.
|
|
char string[32] = "hello, world";
|
strnlen (string, 32)
|
⇒ 12
|
strnlen (string, 5)
|
⇒ 5
|
|
This function is a GNU extension and is declared in ‘string.h’.
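A typical use of ‘strnlen’ is reading a fixed-size field that is not guaranteed to be null-terminated. The following sketch assumes a hypothetical helper `field_to_string`. The feature test macro at the top is one way to request the declaration on the GNU system.

```c
#define _GNU_SOURCE 1
#include <stddef.h>
#include <string.h>

/* Copy the contents of FIELD, a fixed-size array of FIELDSIZE bytes
   that may lack a terminating null byte, into DST as a string.  DST
   must have room for at least FIELDSIZE + 1 bytes.  Return the length
   of the resulting string.  */
size_t
field_to_string (char *dst, const char *field, size_t fieldsize)
{
  /* strnlen never reads past FIELD + FIELDSIZE.  */
  size_t len = strnlen (field, fieldsize);
  memcpy (dst, field, len);
  dst[len] = '\0';
  return len;
}
```

Calling plain ‘strlen’ on such a field would read past the end of the array whenever the field is completely filled.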
|
|
-- Function: size_t wcsnlen (const wchar_t *WS, size_t MAXLEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘wcsnlen’ is the wide character equivalent to ‘strnlen’. The
|
MAXLEN parameter specifies the maximum number of wide characters.
|
|
This function is a GNU extension and is declared in ‘wchar.h’.
|
|
|
File: libc.info, Node: Copying Strings and Arrays, Next: Concatenating Strings, Prev: String Length, Up: String and Array Utilities
|
|
5.4 Copying Strings and Arrays
|
==============================
|
|
You can use the functions described in this section to copy the contents
|
of strings, wide strings, and arrays. The ‘str’ and ‘mem’ functions are
|
declared in ‘string.h’ while the ‘w’ functions are declared in
|
‘wchar.h’.
|
|
A helpful way to remember the ordering of the arguments to the
|
functions in this section is that it corresponds to an assignment
|
expression, with the destination array specified to the left of the
|
source array. Most of these functions return the address of the
|
destination array; a few return the address of the destination’s
|
terminating null, or of just past the destination.
|
|
Most of these functions do not work properly if the source and
|
destination arrays overlap. For example, if the beginning of the
|
destination array overlaps the end of the source array, the original
|
contents of that part of the source array may get overwritten before it
|
is copied. Even worse, in the case of the string functions, the null
|
byte marking the end of the string may be lost, and the copy function
|
might get stuck in a loop trashing all the memory allocated to your
|
program.
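When source and destination do overlap, ‘memmove’ is the function to use. The sketch below shifts a string up by one byte within its own array to make room at the front. The name `prepend_byte` and its return convention are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Insert the byte C at the front of the string stored in BUF, whose
   allocated size is BUFSIZE.  Return 0 on success, or -1 if there is
   no room.  Source and destination overlap here, so memmove is
   required and memcpy would be undefined.  */
int
prepend_byte (char *buf, size_t bufsize, char c)
{
  size_t len = strlen (buf);
  if (len + 2 > bufsize)        /* New byte plus the null byte.  */
    return -1;
  memmove (buf + 1, buf, len + 1);  /* Shift text and null byte.  */
  buf[0] = c;
  return 0;
}
```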
|
|
All functions that have problems copying between overlapping arrays
|
are explicitly identified in this manual. In addition to functions in
|
this section, there are a few others like ‘sprintf’ (*note Formatted
|
Output Functions::) and ‘scanf’ (*note Formatted Input Functions::).
|
|
-- Function: void * memcpy (void *restrict TO, const void *restrict
|
FROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘memcpy’ function copies SIZE bytes from the object beginning
|
at FROM into the object beginning at TO. The behavior of this
|
function is undefined if the two arrays TO and FROM overlap; use
|
‘memmove’ instead if overlapping is possible.
|
|
The value returned by ‘memcpy’ is the value of TO.
|
|
Here is an example of how you might use ‘memcpy’ to copy the
|
contents of an array:
|
|
struct foo *oldarray, *newarray;
|
int arraysize;
|
…
|
memcpy (newarray, oldarray, arraysize * sizeof (struct foo));
|
|
-- Function: wchar_t * wmemcpy (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wmemcpy’ function copies SIZE wide characters from the object
|
beginning at WFROM into the object beginning at WTO. The behavior
|
of this function is undefined if the two arrays WTO and WFROM
|
overlap; use ‘wmemmove’ instead if overlapping is possible.
|
|
The following is a possible implementation of ‘wmemcpy’ but there
|
are more optimizations possible.
|
|
wchar_t *
|
wmemcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom,
|
size_t size)
|
{
|
return (wchar_t *) memcpy (wto, wfrom, size * sizeof (wchar_t));
|
}
|
|
The value returned by ‘wmemcpy’ is the value of WTO.
|
|
This function was introduced in Amendment 1 to ISO C90.
|
|
-- Function: void * mempcpy (void *restrict TO, const void *restrict
|
FROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘mempcpy’ function is nearly identical to the ‘memcpy’
|
function. It copies SIZE bytes from the object beginning at FROM
|
into the object pointed to by TO. But instead of returning the
|
value of TO it returns a pointer to the byte following the last
|
written byte in the object beginning at TO. I.e., the value is
|
‘((void *) ((char *) TO + SIZE))’.
|
|
This function is useful in situations where a number of objects
|
shall be copied to consecutive memory positions.
|
|
void *
|
combine (void *o1, size_t s1, void *o2, size_t s2)
|
{
|
void *result = malloc (s1 + s2);
|
if (result != NULL)
|
mempcpy (mempcpy (result, o1, s1), o2, s2);
|
return result;
|
}
|
|
This function is a GNU extension.
|
|
-- Function: wchar_t * wmempcpy (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wmempcpy’ function is nearly identical to the ‘wmemcpy’
|
function. It copies SIZE wide characters from the object beginning
|
at WFROM into the object pointed to by WTO. But instead of
|
returning the value of WTO it returns a pointer to the wide
|
character following the last written wide character in the object
|
beginning at WTO. I.e., the value is ‘WTO + SIZE’.
|
|
This function is useful in situations where a number of objects
|
shall be copied to consecutive memory positions.
|
|
The following is a possible implementation of ‘wmempcpy’ but there
|
are more optimizations possible.
|
|
wchar_t *
|
wmempcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom,
|
size_t size)
|
{
|
return (wchar_t *) mempcpy (wto, wfrom, size * sizeof (wchar_t));
|
}
|
|
This function is a GNU extension.
|
|
-- Function: void * memmove (void *TO, const void *FROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘memmove’ copies the SIZE bytes at FROM into the SIZE bytes at TO,
|
even if those two blocks of space overlap. In the case of overlap,
|
‘memmove’ is careful to copy the original values of the bytes in
|
the block at FROM, including those bytes which also belong to the
|
block at TO.
|
|
The value returned by ‘memmove’ is the value of TO.
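As a minimal sketch of a case where the blocks really do overlap, the following helper removes a prefix from a string in place. The name ‘drop_prefix’ is hypothetical and is used only for illustration; it is not part of the library.

```c
#include <string.h>

/* Remove the first N bytes of the string S in place and return S.
   The source (S + N) and the destination (S) overlap, so memmove
   is used; memcpy would have undefined behavior here.  */
char *
drop_prefix (char *s, size_t n)
{
  size_t len = strlen (s);

  /* Never skip past the terminating null byte.  */
  if (n > len)
    n = len;

  /* Move the remaining bytes plus the terminating null byte.  */
  return memmove (s, s + n, len - n + 1);
}
```

For instance, applying ‘drop_prefix’ with N equal to 3 to a writable array holding ‘foobar’ leaves ‘bar’ in that array.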
|
|
-- Function: wchar_t * wmemmove (wchar_t *WTO, const wchar_t *WFROM,
|
size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘wmemmove’ copies the SIZE wide characters at WFROM into the SIZE
|
wide characters at WTO, even if those two blocks of space overlap.
|
In the case of overlap, ‘wmemmove’ is careful to copy the original
|
values of the wide characters in the block at WFROM, including
|
those wide characters which also belong to the block at WTO.
|
|
The following is a possible implementation of ‘wmemmove’ but there
|
are more optimizations possible.
|
|
wchar_t *
|
wmemmove (wchar_t *wto, const wchar_t *wfrom,
|
size_t size)
|
{
|
return (wchar_t *) memmove (wto, wfrom, size * sizeof (wchar_t));
|
}
|
|
The value returned by ‘wmemmove’ is the value of WTO.
|
|
This function was introduced in Amendment 1 to ISO C90.
|
|
-- Function: void * memccpy (void *restrict TO, const void *restrict
|
FROM, int C, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function copies no more than SIZE bytes from FROM to TO,
|
stopping if a byte matching C is found. The return value is a
|
pointer into TO one byte past where C was copied, or a null pointer
|
if no byte matching C appeared in the first SIZE bytes of FROM.
|
|
-- Function: void * memset (void *BLOCK, int C, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function copies the value of C (converted to an ‘unsigned
|
char’) into each of the first SIZE bytes of the object beginning at
|
BLOCK. It returns the value of BLOCK.
|
|
-- Function: wchar_t * wmemset (wchar_t *BLOCK, wchar_t WC, size_t
|
SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function copies the value of WC into each of the first SIZE
|
wide characters of the object beginning at BLOCK. It returns the
|
value of BLOCK.
|
|
-- Function: char * strcpy (char *restrict TO, const char *restrict
|
FROM)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This copies bytes from the string FROM (up to and including the
|
terminating null byte) into the string TO. Like ‘memcpy’, this
|
function has undefined results if the strings overlap. The return
|
value is the value of TO.
|
|
-- Function: wchar_t * wcscpy (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This copies wide characters from the wide string WFROM (up to and
|
including the terminating null wide character) into the string WTO.
|
Like ‘wmemcpy’, this function has undefined results if the strings
|
overlap. The return value is the value of WTO.
|
|
-- Function: char * strdup (const char *S)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
This function copies the string S into a newly allocated string.
|
The string is allocated using ‘malloc’; see *note Unconstrained
|
Allocation::. If ‘malloc’ cannot allocate space for the new
|
string, ‘strdup’ returns a null pointer. Otherwise it returns a
|
pointer to the new string.
|
|
-- Function: wchar_t * wcsdup (const wchar_t *WS)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
This function copies the wide string WS into a newly allocated
|
string. The string is allocated using ‘malloc’; see *note
|
Unconstrained Allocation::. If ‘malloc’ cannot allocate space for
|
the new string, ‘wcsdup’ returns a null pointer. Otherwise it
|
returns a pointer to the new wide string.
|
|
This function is a GNU extension.
|
|
-- Function: char * stpcpy (char *restrict TO, const char *restrict
|
FROM)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is like ‘strcpy’, except that it returns a pointer to
|
the end of the string TO (that is, the address of the terminating
|
null byte ‘to + strlen (from)’) rather than the beginning.
|
|
For example, this program uses ‘stpcpy’ to concatenate ‘foo’ and
|
‘bar’ to produce ‘foobar’, which it then prints.
|
|
|
#include <string.h>
|
#include <stdio.h>
|
|
int
|
main (void)
|
{
|
char buffer[10];
|
char *to = buffer;
|
to = stpcpy (to, "foo");
|
to = stpcpy (to, "bar");
|
puts (buffer);
|
return 0;
|
}
|
|
This function is part of POSIX.1-2008 and later editions, but was
|
available in the GNU C Library and other systems as an extension
|
long before it was standardized.
|
|
Its behavior is undefined if the strings overlap. The function is
|
declared in ‘string.h’.
|
|
-- Function: wchar_t * wcpcpy (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is like ‘wcscpy’, except that it returns a pointer to
|
the end of the string WTO (that is, the address of the terminating
|
null wide character ‘wto + wcslen (wfrom)’) rather than the
|
beginning.
|
|
This function is not part of ISO or POSIX but was found useful
|
while developing the GNU C Library itself.
|
|
The behavior of ‘wcpcpy’ is undefined if the strings overlap.
|
|
‘wcpcpy’ is a GNU extension and is declared in ‘wchar.h’.
|
|
-- Macro: char * strdupa (const char *S)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This macro is similar to ‘strdup’ but allocates the new string
|
using ‘alloca’ instead of ‘malloc’ (*note Variable Size
|
Automatic::). This means of course the returned string has the
|
same limitations as any block of memory allocated using ‘alloca’.
|
|
Because memory allocated with ‘alloca’ is freed when the function
|
that called ‘alloca’ returns, ‘strdupa’ can be implemented only as
|
a macro; you cannot take its address. Despite this limitation it
|
is useful. The following code shows a situation
|
where using ‘malloc’ would be a lot more expensive.
|
|
|
#include <paths.h>
|
#include <string.h>
|
#include <stdio.h>
|
|
const char path[] = _PATH_STDPATH;
|
|
int
|
main (void)
|
{
|
char *wr_path = strdupa (path);
|
char *cp = strtok (wr_path, ":");
|
|
while (cp != NULL)
|
{
|
puts (cp);
|
cp = strtok (NULL, ":");
|
}
|
return 0;
|
}
|
|
Please note that calling ‘strtok’ using PATH directly is invalid,
|
because ‘strtok’ modifies its argument and PATH is ‘const’. It is
|
also not allowed to call ‘strdupa’ in the argument list of ‘strtok’,
|
since ‘strdupa’ uses ‘alloca’ (*note Variable Size Automatic::),
|
which can interfere with the parameter passing.
|
|
This macro is only available if GNU CC is used.
|
|
-- Function: void bcopy (const void *FROM, void *TO, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is a partially obsolete alternative for ‘memmove’, derived
|
from BSD. Note that it is not quite equivalent to ‘memmove’,
|
because the arguments are not in the same order and there is no
|
return value.
|
|
-- Function: void bzero (void *BLOCK, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is a partially obsolete alternative for ‘memset’, derived from
|
BSD. Note that it is not as general as ‘memset’, because the only
|
value it can store is zero.
|
|
|
File: libc.info, Node: Concatenating Strings, Next: Truncating Strings, Prev: Copying Strings and Arrays, Up: String and Array Utilities
|
|
5.5 Concatenating Strings
|
=========================
|
|
The functions described in this section concatenate the contents of a
|
string or wide string to another. They follow the string-copying
|
functions in their conventions. *Note Copying Strings and Arrays::.
|
‘strcat’ is declared in the header file ‘string.h’ while ‘wcscat’ is
|
declared in ‘wchar.h’.
|
|
-- Function: char * strcat (char *restrict TO, const char *restrict
|
FROM)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strcat’ function is similar to ‘strcpy’, except that the bytes
|
from FROM are concatenated or appended to the end of TO, instead of
|
overwriting it. That is, the first byte from FROM overwrites the
|
null byte marking the end of TO.
|
|
An equivalent definition for ‘strcat’ would be:
|
|
char *
|
strcat (char *restrict to, const char *restrict from)
|
{
|
strcpy (to + strlen (to), from);
|
return to;
|
}
|
|
This function has undefined results if the strings overlap.
|
|
As noted below, this function has significant performance issues.
|
|
-- Function: wchar_t * wcscat (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcscat’ function is similar to ‘wcscpy’, except that the wide
|
characters from WFROM are concatenated or appended to the end of
|
WTO, instead of overwriting it. That is, the first wide character
|
from WFROM overwrites the null wide character marking the end of
|
WTO.
|
|
An equivalent definition for ‘wcscat’ would be:
|
|
wchar_t *
|
wcscat (wchar_t *wto, const wchar_t *wfrom)
|
{
|
wcscpy (wto + wcslen (wto), wfrom);
|
return wto;
|
}
|
|
This function has undefined results if the strings overlap.
|
|
As noted below, this function has significant performance issues.
|
|
Programmers using the ‘strcat’ or ‘wcscat’ functions (or the ‘strncat’
|
or ‘wcsncat’ functions defined in a later section, for that matter) can
|
easily be recognized as lazy and reckless. In almost all situations the
|
lengths of the participating strings are known (they had better be,
|
since how else can one ensure that the allocated buffer is large
|
enough?), or at least one could know them by keeping track of the
|
results of the various function calls. It is then very inefficient
|
to use ‘strcat’/‘wcscat’, because a lot of time is wasted finding the
|
end of the destination string before the actual copying can start.
|
This is a common example:
|
|
/* This function concatenates arbitrarily many strings. The last
|
parameter must be ‘NULL’. */
|
char *
|
concat (const char *str, ...)
|
{
|
va_list ap, ap2;
|
size_t total = 1;
|
const char *s;
|
char *result;
|
|
va_start (ap, str);
|
va_copy (ap2, ap);
|
|
/* Determine how much space we need. */
|
for (s = str; s != NULL; s = va_arg (ap, const char *))
|
total += strlen (s);
|
|
va_end (ap);
|
|
result = (char *) malloc (total);
|
if (result != NULL)
|
{
|
result[0] = '\0';
|
|
/* Copy the strings. */
|
for (s = str; s != NULL; s = va_arg (ap2, const char *))
|
strcat (result, s);
|
}
|
|
va_end (ap2);
|
|
return result;
|
}
|
|
This looks quite simple, especially the second loop where the strings
|
are actually copied. But these innocent lines hide a major performance
|
penalty. Just imagine that ten strings of 100 bytes each have to be
|
concatenated. For the second string we search the already stored 100
|
bytes for the end of the string so that we can append the next string.
|
For all ten strings in total, the comparisons necessary to find the end
|
of the intermediate results (0 + 100 + 200 + ... + 900 bytes) sum up to
|
about 4500! If we keep track of the end of the result while copying,
|
and grow the allocation as needed, we can write this function more
|
efficiently:
|
|
char *
|
concat (const char *str, ...)
|
{
|
va_list ap;
|
size_t allocated = 100;
|
char *result = (char *) malloc (allocated);
|
|
if (result != NULL)
|
{
|
char *newp;
|
char *wp;
|
const char *s;
|
|
va_start (ap, str);
|
|
wp = result;
|
for (s = str; s != NULL; s = va_arg (ap, const char *))
|
{
|
size_t len = strlen (s);
|
|
/* Resize the allocated memory if necessary. */
|
if (wp + len + 1 > result + allocated)
|
{
|
allocated = (allocated + len) * 2;
|
newp = (char *) realloc (result, allocated);
|
if (newp == NULL)
|
{
|
va_end (ap);
|
free (result);
|
return NULL;
|
}
|
wp = newp + (wp - result);
|
result = newp;
|
}
|
|
wp = mempcpy (wp, s, len);
|
}
|
|
/* Terminate the result string. */
|
*wp++ = '\0';
|
|
/* Resize memory to the optimal size. */
|
newp = realloc (result, wp - result);
|
if (newp != NULL)
|
result = newp;
|
|
va_end (ap);
|
}
|
|
return result;
|
}
|
|
With a bit more knowledge about the input strings one could fine-tune
|
the memory allocation. The difference we are pointing to here is that
|
we don’t use ‘strcat’ anymore. We always keep track of the length of
|
the current intermediate result so we can save ourselves the search for
|
the end of the string and use ‘mempcpy’. Please note that we also don’t
|
use ‘stpcpy’ which might seem more natural since we are handling
|
strings. But this is not necessary since we already know the length of
|
the string and therefore can use the faster memory copying function.
|
The example would work for wide characters the same way.
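Where the GNU extension ‘mempcpy’ is not available, the same idea can be sketched with ‘memcpy’ and a write pointer that is advanced by hand. The variant below is an illustration only: the name ‘join_strings’ is hypothetical, it takes an array of strings rather than a variable argument list, and it measures every string twice in exchange for a single exact allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Concatenate the COUNT strings in STRS into newly allocated memory.
   Return a null pointer if the allocation fails.  */
char *
join_strings (const char *const *strs, size_t count)
{
  /* First pass: determine how much space we need.  */
  size_t total = 1;
  for (size_t i = 0; i < count; i++)
    total += strlen (strs[i]);

  char *result = malloc (total);
  if (result == NULL)
    return NULL;

  /* Second pass: copy each string to the current end of the result.
     WP always points to the end, so the destination is never
     searched for its terminating null byte.  */
  char *wp = result;
  for (size_t i = 0; i < count; i++)
    {
      size_t len = strlen (strs[i]);
      memcpy (wp, strs[i], len);
      wp += len;
    }

  /* Terminate the result string.  */
  *wp = '\0';
  return result;
}
```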
|
|
Whenever a programmer feels the need to use ‘strcat’ she or he should
|
think twice and look through the program to see whether the code cannot
|
be rewritten to take advantage of already calculated results. Again: it
|
is almost always unnecessary to use ‘strcat’.
|
|
|
File: libc.info, Node: Truncating Strings, Next: String/Array Comparison, Prev: Concatenating Strings, Up: String and Array Utilities
|
|
5.6 Truncating Strings while Copying
|
====================================
|
|
The functions described in this section copy or concatenate the
|
possibly-truncated contents of a string or array to another, and
|
similarly for wide strings. They follow the string-copying functions in
|
their header conventions. *Note Copying Strings and Arrays::. The
|
‘str’ functions are declared in the header file ‘string.h’ and the ‘wc’
|
functions are declared in the file ‘wchar.h’.
|
|
-- Function: char * strncpy (char *restrict TO, const char *restrict
|
FROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is similar to ‘strcpy’ but always copies exactly SIZE
|
bytes into TO.
|
|
If FROM does not contain a null byte in its first SIZE bytes,
|
‘strncpy’ copies just the first SIZE bytes. In this case no null
|
terminator is written into TO.
|
|
Otherwise FROM must be a string with length less than SIZE. In
|
this case ‘strncpy’ copies all of FROM, followed by enough null
|
bytes to add up to SIZE bytes in all.
|
|
The behavior of ‘strncpy’ is undefined if the strings overlap.
|
|
This function was designed for now-rarely-used arrays consisting of
|
non-null bytes followed by zero or more null bytes. It needs to
|
set all SIZE bytes of the destination, even when SIZE is much
|
greater than the length of FROM. As noted below, this function is
|
generally a poor choice for processing text.
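The kind of array this function was designed for can be sketched as a fixed-width field that is null-padded but not necessarily null-terminated. The structure ‘struct record’ and the helpers ‘set_name’ and ‘name_length’ are hypothetical names chosen for this illustration.

```c
#include <string.h>

/* A fixed-width field: non-null bytes followed by zero or more
   null bytes.  It is NOT a string when all 8 bytes are in use.  */
struct record
{
  char name[8];
};

/* Store NAME in the field, truncating or null-padding as needed.  */
void
set_name (struct record *r, const char *name)
{
  strncpy (r->name, name, sizeof r->name);
}

/* Return the number of non-null bytes in the field.  strnlen is
   used because the field may lack a terminating null byte.  */
size_t
name_length (const struct record *r)
{
  return strnlen (r->name, sizeof r->name);
}
```

Code that reads such a field must bound every access by the field width, as ‘name_length’ does, instead of treating the field as a string.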
|
|
-- Function: wchar_t * wcsncpy (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is similar to ‘wcscpy’ but always copies exactly SIZE
|
wide characters into WTO.
|
|
If WFROM does not contain a null wide character in its first SIZE
|
wide characters, then ‘wcsncpy’ copies just the first SIZE wide
|
characters. In this case no null terminator is written into WTO.
|
|
Otherwise WFROM must be a wide string with length less than SIZE.
|
In this case ‘wcsncpy’ copies all of WFROM, followed by enough null
|
wide characters to add up to SIZE wide characters in all.
|
|
The behavior of ‘wcsncpy’ is undefined if the strings overlap.
|
|
This function is the wide-character counterpart of ‘strncpy’ and
|
suffers from most of the problems that ‘strncpy’ does. For
|
example, as noted below, this function is generally a poor choice
|
for processing text.
|
|
-- Function: char * strndup (const char *S, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
This function is similar to ‘strdup’ but always copies at most SIZE
|
bytes into the newly allocated string.
|
|
If the length of S is more than SIZE, then ‘strndup’ copies just
|
the first SIZE bytes and adds a closing null byte. Otherwise all
|
bytes are copied and the string is terminated.
|
|
This function differs from ‘strncpy’ in that it always terminates
|
the destination string.
|
|
As noted below, this function is generally a poor choice for
|
processing text.
|
|
‘strndup’ is a GNU extension.
|
|
-- Macro: char * strndupa (const char *S, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This macro is similar to ‘strndup’ but like ‘strdupa’ it
|
allocates the new string using ‘alloca’ (*note Variable Size
|
Automatic::). The same advantages and limitations of ‘strdupa’ are
|
valid for ‘strndupa’, too.
|
|
This function is implemented only as a macro, just like ‘strdupa’.
|
Just as ‘strdupa’ this macro also must not be used inside the
|
parameter list in a function call.
|
|
As noted below, this function is generally a poor choice for
|
processing text.
|
|
‘strndupa’ is only available if GNU CC is used.
|
|
-- Function: char * stpncpy (char *restrict TO, const char *restrict
|
FROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is similar to ‘stpcpy’ but always copies exactly SIZE
|
bytes into TO.
|
|
If the length of FROM is SIZE or more, then ‘stpncpy’ copies just
|
the first SIZE bytes and returns a pointer to the byte directly
|
following the one which was copied last. Note that in this case
|
there is no null terminator written into TO.
|
|
If the length of FROM is less than SIZE, then ‘stpncpy’ copies all
|
of FROM, followed by enough null bytes to add up to SIZE bytes in
|
all. This padding is rarely useful, but it is provided for
|
contexts that rely on the same behavior of ‘strncpy’. In this case
|
‘stpncpy’ returns a pointer to the _first_ written null byte.
|
|
This function is not part of ISO or POSIX but was found useful
|
while developing the GNU C Library itself.
|
|
Its behavior is undefined if the strings overlap. The function is
|
declared in ‘string.h’.
|
|
As noted below, this function is generally a poor choice for
|
processing text.
|
|
-- Function: wchar_t * wcpncpy (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is similar to ‘wcpcpy’ but always copies exactly
|
SIZE wide characters into WTO.
|
|
If the length of WFROM is SIZE or more, then ‘wcpncpy’ copies
|
just the first SIZE wide characters and returns a pointer to the
|
wide character directly following the last wide character that was
|
copied. Note that in this case there is no null terminator
|
written into WTO.
|
|
If the length of WFROM is less than SIZE, then ‘wcpncpy’ copies all
|
of WFROM, followed by enough null wide characters to add up to SIZE
|
wide characters in all. This padding is rarely useful, but it is
|
provided for contexts that rely on the same behavior of
|
‘wcsncpy’. In this case ‘wcpncpy’ returns a pointer to the _first_
|
written null wide character.
|
|
This function is not part of ISO or POSIX but was found useful
|
while developing the GNU C Library itself.
|
|
Its behavior is undefined if the strings overlap.
|
|
As noted below, this function is generally a poor choice for
|
processing text.
|
|
‘wcpncpy’ is a GNU extension.
|
|
-- Function: char * strncat (char *restrict TO, const char *restrict
|
FROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is like ‘strcat’ except that not more than SIZE bytes
|
from FROM are appended to the end of TO, and FROM need not be
|
null-terminated. A single null byte is also always appended to TO,
|
so the total allocated size of TO must be at least ‘SIZE + 1’ bytes
|
longer than its initial length.
|
|
The ‘strncat’ function could be implemented like this:
|
|
char *
|
strncat (char *to, const char *from, size_t size)
|
{
|
size_t len = strlen (to);
|
memcpy (to + len, from, strnlen (from, size));
|
to[len + strnlen (from, size)] = '\0';
|
return to;
|
}
|
|
The behavior of ‘strncat’ is undefined if the strings overlap.
|
|
As a companion to ‘strncpy’, ‘strncat’ was designed for
|
now-rarely-used arrays consisting of non-null bytes followed by
|
zero or more null bytes. As noted below, this function is
|
generally a poor choice for processing text. Also, this function
|
has significant performance issues. *Note Concatenating Strings::.
|
|
-- Function: wchar_t * wcsncat (wchar_t *restrict WTO, const wchar_t
|
*restrict WFROM, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is like ‘wcscat’ except that not more than SIZE wide
|
characters from WFROM are appended to the end of WTO, and WFROM need
|
not be null-terminated. A single null wide character is also
|
always appended to WTO, so the total allocated size of WTO must be at
|
least ‘wcsnlen (WFROM, SIZE) + 1’ wide characters longer than its
|
initial length.
|
|
The ‘wcsncat’ function could be implemented like this:
|
|
wchar_t *
|
wcsncat (wchar_t *restrict wto, const wchar_t *restrict wfrom,
|
size_t size)
|
{
|
size_t len = wcslen (wto);
|
memcpy (wto + len, wfrom, wcsnlen (wfrom, size) * sizeof (wchar_t));
|
wto[len + wcsnlen (wfrom, size)] = L'\0';
|
return wto;
|
}
|
|
The behavior of ‘wcsncat’ is undefined if the strings overlap.
|
|
As noted below, this function is generally a poor choice for
|
processing text. Also, this function has significant performance
|
issues. *Note Concatenating Strings::.
|
|
Because these functions can abruptly truncate strings or wide
|
strings, they are generally poor choices for processing text. When
|
copying or concatenating multibyte strings, they can truncate within a
|
multibyte character so that the result is not a valid multibyte string.
|
When combining or concatenating multibyte or wide strings, they may
|
truncate the output after a combining character, resulting in a
|
corrupted grapheme. They can cause bugs even when processing
|
single-byte strings: for example, when calculating an ASCII-only user
|
name, a truncated name can identify the wrong user.
|
|
Although some buffer overruns can be prevented by manually replacing
|
calls to copying functions with calls to truncation functions, there are
|
often easier and safer automatic techniques that cause buffer overruns
|
to reliably terminate a program, such as GCC’s ‘-fcheck-pointer-bounds’
|
and ‘-fsanitize=address’ options. *Note Options for Debugging Your
|
Program or GCC: (gcc.info)Debugging Options. Because truncation
|
functions can mask application bugs that would otherwise be caught by
|
the automatic techniques, these functions should be used only when the
|
application’s underlying logic requires truncation.
|
|
*Note:* GNU programs should not truncate strings or wide strings to
|
fit arbitrary size limits. *Note Writing Robust Programs:
|
(standards)Semantics. Instead of string-truncation functions, it is
|
usually better to use dynamic memory allocation (*note Unconstrained
|
Allocation::) and functions such as ‘strdup’ or ‘asprintf’ to construct
|
strings.
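As a sketch of the dynamic-allocation approach, the following helper builds a string of whatever length is needed instead of truncating it to a fixed buffer. It swaps in the ISO C technique of calling ‘snprintf’ twice (once to measure, once to format) in place of the GNU function ‘asprintf’; the name ‘make_path’ is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return "DIR/FILE" in newly allocated memory, or a null pointer on
   failure.  The result is never truncated; the caller frees it.  */
char *
make_path (const char *dir, const char *file)
{
  /* With a null buffer and size 0, snprintf only reports the number
     of bytes the output would need, excluding the null byte.  */
  int needed = snprintf (NULL, 0, "%s/%s", dir, file);
  if (needed < 0)
    return NULL;

  char *path = malloc ((size_t) needed + 1);
  if (path != NULL)
    snprintf (path, (size_t) needed + 1, "%s/%s", dir, file);

  return path;
}
```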
|
|
|
File: libc.info, Node: String/Array Comparison, Next: Collation Functions, Prev: Truncating Strings, Up: String and Array Utilities
|
|
5.7 String/Array Comparison
|
===========================
|
|
You can use the functions in this section to perform comparisons on the
|
contents of strings and arrays. As well as checking for equality, these
|
functions can also be used as the ordering functions for sorting
|
operations. *Note Searching and Sorting::, for an example of this.
|
|
Unlike most comparison operations in C, the string comparison
|
functions return a nonzero value if the strings are _not_ equivalent
|
rather than if they are. The sign of the value indicates the relative
|
ordering of the first part of the strings that are not equivalent: a
|
negative value indicates that the first string is “less” than the
|
second, while a positive value indicates that the first string is
|
“greater”.
|
|
The most common use of these functions is to check only for equality.
|
This is canonically done with an expression like ‘! strcmp (s1, s2)’.
|
|
The ‘str’ and ‘mem’ functions are declared in the header file ‘string.h’ while the ‘w’ functions are declared in ‘wchar.h’.
|
|
-- Function: int memcmp (const void *A1, const void *A2, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The function ‘memcmp’ compares the SIZE bytes of memory beginning
|
at A1 against the SIZE bytes of memory beginning at A2. The value
|
returned has the same sign as the difference between the first
|
differing pair of bytes (interpreted as ‘unsigned char’ objects,
|
then promoted to ‘int’).
|
|
If the contents of the two blocks are equal, ‘memcmp’ returns ‘0’.
|
|
-- Function: int wmemcmp (const wchar_t *A1, const wchar_t *A2, size_t
|
SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The function ‘wmemcmp’ compares the SIZE wide characters beginning
|
at A1 against the SIZE wide characters beginning at A2. The value
|
returned is smaller than or larger than zero depending on whether
|
the first differing wide character in A1 is smaller or larger than
|
the corresponding wide character in A2.
|
|
If the contents of the two blocks are equal, ‘wmemcmp’ returns ‘0’.
|
|
On arbitrary arrays, the ‘memcmp’ function is mostly useful for
|
testing equality. It usually isn’t meaningful to do byte-wise ordering
|
comparisons on arrays of things other than bytes. For example, a
|
byte-wise comparison on the bytes that make up floating-point numbers
|
isn’t likely to tell you anything about the relationship between the
|
values of the floating-point numbers.
|
|
‘wmemcmp’ is really only useful to compare arrays of type ‘wchar_t’
|
since the function looks at ‘sizeof (wchar_t)’ bytes at a time and this
|
number of bytes is system dependent.
|
|
You should also be careful about using ‘memcmp’ to compare objects
|
that can contain “holes”, such as the padding inserted into structure
|
objects to enforce alignment requirements, extra space at the end of
|
unions, and extra bytes at the ends of strings whose length is less than
|
their allocated size. The contents of these “holes” are indeterminate
|
and may cause strange behavior when performing byte-wise comparisons.
|
For more predictable results, perform an explicit component-wise
|
comparison.
|
|
For example, given a structure type definition like:
|
|
struct foo
|
{
|
unsigned char tag;
|
union
|
{
|
double f;
|
long i;
|
char *p;
|
} value;
|
};
|
|
you are better off writing a specialized comparison function to compare
|
‘struct foo’ objects instead of comparing them with ‘memcmp’.
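Such a comparison function might look like the sketch below. It assumes, purely for illustration, that the ‘tag’ member records which union member is active and that the ‘p’ member points to a string; the ‘TAG_’ constants and the name ‘foo_equal’ are hypothetical.

```c
#include <string.h>

/* Hypothetical tag values saying which union member is active.  */
enum { TAG_DOUBLE, TAG_LONG, TAG_STRING };

struct foo
{
  unsigned char tag;
  union
  {
    double f;
    long i;
    char *p;
  } value;
};

/* Return nonzero if A and B hold equal values.  Only the tag and
   the active union member are examined, so padding bytes and the
   unused part of the union cannot affect the result.  */
int
foo_equal (const struct foo *a, const struct foo *b)
{
  if (a->tag != b->tag)
    return 0;

  switch (a->tag)
    {
    case TAG_DOUBLE:
      return a->value.f == b->value.f;
    case TAG_LONG:
      return a->value.i == b->value.i;
    case TAG_STRING:
      return strcmp (a->value.p, b->value.p) == 0;
    default:
      return 0;
    }
}
```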
|
|
-- Function: int strcmp (const char *S1, const char *S2)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strcmp’ function compares the string S1 against S2, returning
|
a value that has the same sign as the difference between the first
|
differing pair of bytes (interpreted as ‘unsigned char’ objects,
|
then promoted to ‘int’).
|
|
If the two strings are equal, ‘strcmp’ returns ‘0’.
|
|
A consequence of the ordering used by ‘strcmp’ is that if S1 is an
|
initial substring of S2, then S1 is considered to be “less than”
|
S2.
|
|
‘strcmp’ does not take sorting conventions of the language the
|
strings are written in into account. To get that one has to use
|
‘strcoll’.
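Because ‘strcmp’ returns a negative, zero, or positive value, it can serve as the basis of an ordering function for ‘qsort’. The following is a minimal sketch; the names ‘compare_strings’ and ‘sort_strings’ are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, so each argument is
   really a pointer to a 'const char *'.  */
static int
compare_strings (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Sort an array of COUNT strings into strcmp order.  */
void
sort_strings (const char **strings, size_t count)
{
  qsort (strings, count, sizeof *strings, compare_strings);
}
```

With this ordering an initial substring sorts first, so ‘app’ is placed before ‘apple’.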
|
|
-- Function: int wcscmp (const wchar_t *WS1, const wchar_t *WS2)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcscmp’ function compares the wide string WS1 against WS2.
|
The value returned is smaller than or larger than zero depending on
|
whether the first differing wide character in WS1 is smaller or
|
larger than the corresponding wide character in WS2.
|
|
If the two strings are equal, ‘wcscmp’ returns ‘0’.
|
|
A consequence of the ordering used by ‘wcscmp’ is that if WS1 is an
|
initial substring of WS2, then WS1 is considered to be “less than”
|
WS2.
|
|
‘wcscmp’ does not take sorting conventions of the language the
|
strings are written in into account. To get that one has to use
|
‘wcscoll’.
|
|
-- Function: int strcasecmp (const char *S1, const char *S2)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function is like ‘strcmp’, except that differences in case are
|
ignored, and its arguments must be multibyte strings. How
|
uppercase and lowercase characters are related is determined by the
|
currently selected locale. In the standard ‘"C"’ locale the
|
characters Ä and ä do not match but in a locale which regards these
|
characters as parts of the alphabet they do match.
|
|
‘strcasecmp’ is derived from BSD.
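As a minimal sketch, ‘strcasecmp’ can serve as the comparison behind ‘qsort’ to sort strings without regard to case.  The names ‘compare_ignoring_case’ and ‘sort_ignoring_case’ are hypothetical, and the result depends on the current locale as described above.

```c
#include <stdlib.h>
#include <strings.h>

/* Comparison function for qsort: each argument points to an
   array element, which is itself a pointer to a string.  */
static int
compare_ignoring_case (const void *v1, const void *v2)
{
  const char *const *p1 = v1;
  const char *const *p2 = v2;

  return strcasecmp (*p1, *p2);
}

/* Sort ARRAY, which holds NSTRINGS string pointers, ignoring
   differences in case.  */
void
sort_ignoring_case (const char **array, size_t nstrings)
{
  qsort (array, nstrings, sizeof *array, compare_ignoring_case);
}
```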
|
|
-- Function: int wcscasecmp (const wchar_t *WS1, const wchar_t *WS2)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function is like ‘wcscmp’, except that differences in case are
|
ignored. How uppercase and lowercase characters are related is
|
determined by the currently selected locale. In the standard ‘"C"’
|
locale the characters Ä and ä do not match but in a locale which
|
regards these characters as parts of the alphabet they do match.
|
|
‘wcscasecmp’ is a GNU extension.
|
|
-- Function: int strncmp (const char *S1, const char *S2, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is similar to ‘strcmp’, except that no more than
|
SIZE bytes are compared. In other words, if the two strings are
|
the same in their first SIZE bytes, the return value is zero.
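One common use of this property is a prefix test.  The following is a sketch only; the helper name ‘has_prefix’ is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if STRING begins with PREFIX.  At most
   strlen (PREFIX) bytes are compared, so anything that follows
   the prefix in STRING is ignored.  */
bool
has_prefix (const char *string, const char *prefix)
{
  return strncmp (string, prefix, strlen (prefix)) == 0;
}
```

If STRING is shorter than PREFIX, the comparison stops at the null byte terminating STRING and reports a difference, so the function returns false.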
|
|
-- Function: int wcsncmp (const wchar_t *WS1, const wchar_t *WS2,
|
size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function is similar to ‘wcscmp’, except that no more than SIZE
|
wide characters are compared. In other words, if the two strings
|
are the same in their first SIZE wide characters, the return value
|
is zero.
|
|
-- Function: int strncasecmp (const char *S1, const char *S2, size_t N)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function is like ‘strncmp’, except that differences in case
|
are ignored, and the compared parts of the arguments should consist
|
of valid multibyte characters. Like ‘strcasecmp’, it is locale
|
dependent how uppercase and lowercase characters are related.
|
|
‘strncasecmp’ is derived from BSD.
|
|
-- Function: int wcsncasecmp (const wchar_t *WS1, const wchar_t *WS2,
|
size_t N)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This function is like ‘wcsncmp’, except that differences in case
|
are ignored. Like ‘wcscasecmp’, it is locale dependent how
|
uppercase and lowercase characters are related.
|
|
‘wcsncasecmp’ is a GNU extension.
|
|
Here are some examples showing the use of ‘strcmp’ and ‘strncmp’
|
(equivalent examples can be constructed for the wide character
|
functions). These examples assume the use of the ASCII character set.
|
(If some other character set—say, EBCDIC—is used instead, then the
|
glyphs are associated with different numeric codes, and the return
|
values and ordering may differ.)
|
|
strcmp ("hello", "hello")
|
⇒ 0 /* These two strings are the same. */
|
strcmp ("hello", "Hello")
|
⇒ 32 /* Comparisons are case-sensitive. */
|
strcmp ("hello", "world")
|
⇒ -15 /* The byte ‘'h'’ comes before ‘'w'’. */
|
strcmp ("hello", "hello, world")
|
⇒ -44 /* Comparing a null byte against a comma. */
|
strncmp ("hello", "hello, world", 5)
|
⇒ 0 /* The initial 5 bytes are the same. */
|
strncmp ("hello, world", "hello, stupid world!!!", 5)
|
⇒ 0 /* The initial 5 bytes are the same. */
|
|
-- Function: int strverscmp (const char *S1, const char *S2)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
The ‘strverscmp’ function compares the string S1 against S2,
|
considering them as holding indices/version numbers. The return
|
value follows the same conventions as found in the ‘strcmp’
|
function. In fact, if S1 and S2 contain no digits, ‘strverscmp’
|
behaves like ‘strcmp’ (in the sense that the sign of the result is
|
the same).
|
|
The comparison algorithm which the ‘strverscmp’ function implements
|
differs slightly from other version-comparison algorithms. The
|
implementation is based on a finite-state machine, whose behavior
|
is approximated below.
|
|
• The input strings are each split into sequences of non-digits
|
and digits. These sequences can be empty at the beginning and
|
end of the string. Digits are determined by the ‘isdigit’
|
function and are thus subject to the current locale.
|
|
• Comparison starts with a (possibly empty) non-digit sequence.
|
The first pair of non-equal sequences of non-digits or digits
|
determines the outcome of the comparison.
|
|
• Corresponding non-digit sequences in both strings are compared
|
lexicographically if their lengths are equal. If the lengths
|
differ, the shorter non-digit sequence is extended with the
|
input string character immediately following it (which may be
|
the null terminator), the other sequence is truncated to be of
|
the same (extended) length, and these two sequences are
|
compared lexicographically. In the last case, the sequence
|
comparison determines the result of the function because the
|
extension character (or some character before it) is
|
necessarily different from the character at the same offset in
|
the other input string.
|
|
• For two sequences of digits, the number of leading zeros is
|
counted (which can be zero). If the count differs, the string
|
with more leading zeros in the digit sequence is considered
|
smaller than the other string.
|
|
• If the two sequences of digits have no leading zeros, they are
|
compared as integers, that is, the string with the longer
|
digit sequence is deemed larger, and if both sequences are of
|
equal length, they are compared lexicographically.
|
|
• If both digit sequences start with a zero and have an equal
|
number of leading zeros, they are compared lexicographically
|
if their lengths are the same. If the lengths differ, the
|
shorter sequence is extended with the following character in
|
its input string, and the other sequence is truncated to the
|
same length, and both sequences are compared lexicographically
|
(similar to the non-digit sequence case above).
|
|
The treatment of leading zeros and the tie-breaking extension
|
characters (which in effect propagate across non-digit/digit
|
sequence boundaries) differs from other version-comparison
|
algorithms.
|
|
strverscmp ("no digit", "no digit")
|
⇒ 0 /* same behavior as strcmp. */
|
strverscmp ("item#99", "item#100")
|
⇒ <0 /* same prefix, but 99 < 100. */
|
strverscmp ("alpha1", "alpha001")
|
⇒ >0 /* different number of leading zeros (0 and 2). */
|
strverscmp ("part1_f012", "part1_f01")
|
⇒ >0 /* lexicographical comparison with leading zeros. */
|
strverscmp ("foo.009", "foo.0")
|
⇒ <0 /* different number of leading zeros (2 and 1). */
|
|
‘strverscmp’ is a GNU extension.
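A typical use is sorting names that embed numbers.  The sketch below assumes that ‘_GNU_SOURCE’ is defined before any header is included, so that ‘string.h’ declares ‘strverscmp’; the names ‘compare_versions’ and ‘sort_by_version’ are hypothetical.

```c
#define _GNU_SOURCE 1           /* strverscmp is a GNU extension.  */
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort: each argument points to an
   array element, which is itself a pointer to a string.  */
static int
compare_versions (const void *v1, const void *v2)
{
  const char *const *p1 = v1;
  const char *const *p2 = v2;

  return strverscmp (*p1, *p2);
}

/* Sort NAMES, which holds COUNT string pointers, so that embedded
   numbers are ordered by value rather than byte by byte.  */
void
sort_by_version (const char **names, size_t count)
{
  qsort (names, count, sizeof *names, compare_versions);
}
```

With ‘strcmp’ as the comparison, ‘"item#100"’ would sort before ‘"item#99"’; with ‘strverscmp’ it sorts after.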
|
|
-- Function: int bcmp (const void *A1, const void *A2, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is an obsolete alias for ‘memcmp’, derived from BSD.
|
|
|
File: libc.info, Node: Collation Functions, Next: Search Functions, Prev: String/Array Comparison, Up: String and Array Utilities
|
|
5.8 Collation Functions
|
=======================
|
|
In some locales, the conventions for lexicographic ordering differ from
|
the strict numeric ordering of character codes. For example, in Spanish
|
most glyphs with diacritical marks such as accents are not considered
|
distinct letters for the purposes of collation. On the other hand, the
|
two-character sequence ‘ll’ is treated as a single letter that is
|
collated immediately after ‘l’.
|
|
You can use the functions ‘strcoll’ and ‘strxfrm’ (declared in the
|
header file ‘string.h’) and ‘wcscoll’ and ‘wcsxfrm’ (declared in the
|
header file ‘wchar.h’) to compare strings using a collation ordering
|
appropriate for the current locale. The locale used by these functions
|
in particular can be specified by setting the locale for the
|
‘LC_COLLATE’ category; see *note Locales::.
|
|
In the standard C locale, the collation sequence for ‘strcoll’ is the
|
same as that for ‘strcmp’. Similarly, ‘wcscoll’ and ‘wcscmp’ are the
|
same in this situation.
|
|
Effectively, the way these functions work is by applying a mapping to
|
transform the characters in a multibyte string to a byte sequence that
|
represents the string’s position in the collating sequence of the
|
current locale. Comparing two such byte sequences in a simple fashion
|
is equivalent to comparing the strings with the locale’s collating
|
sequence.
|
|
The functions ‘strcoll’ and ‘wcscoll’ perform this translation
|
implicitly, in order to do one comparison. By contrast, ‘strxfrm’ and
|
‘wcsxfrm’ perform the mapping explicitly. If you are making multiple
|
comparisons using the same string or set of strings, it is likely to be
|
more efficient to use ‘strxfrm’ or ‘wcsxfrm’ to transform all the
|
strings just once, and subsequently compare the transformed strings with
|
‘strcmp’ or ‘wcscmp’.
|
|
-- Function: int strcoll (const char *S1, const char *S2)
|
Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem |
|
*Note POSIX Safety Concepts::.
|
|
The ‘strcoll’ function is similar to ‘strcmp’ but uses the
|
collating sequence of the current locale for collation (the
|
‘LC_COLLATE’ locale). The arguments are multibyte strings.
|
|
-- Function: int wcscoll (const wchar_t *WS1, const wchar_t *WS2)
|
Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem |
|
*Note POSIX Safety Concepts::.
|
|
The ‘wcscoll’ function is similar to ‘wcscmp’ but uses the
|
collating sequence of the current locale for collation (the
|
‘LC_COLLATE’ locale).
|
|
Here is an example of sorting an array of strings, using ‘strcoll’ to
|
compare them. The actual sort algorithm is not written here; it comes
|
from ‘qsort’ (*note Array Sort Function::). The job of the code shown
|
here is to say how to compare the strings while sorting them. (Later on
|
in this section, we will show a way to do this more efficiently using
|
‘strxfrm’.)
|
|
/* This is the comparison function used with ‘qsort’. */
|
|
int
|
compare_elements (const void *v1, const void *v2)
|
{
|
char * const *p1 = v1;
|
char * const *p2 = v2;
|
|
return strcoll (*p1, *p2);
|
}
|
|
/* This is the entry point—the function to sort
|
strings using the locale’s collating sequence. */
|
|
void
|
sort_strings (char **array, int nstrings)
|
{
|
/* Sort ‘array’ by comparing the strings. */
|
qsort (array, nstrings,
|
sizeof (char *), compare_elements);
|
}
|
|
-- Function: size_t strxfrm (char *restrict TO, const char *restrict
|
FROM, size_t SIZE)
|
Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem |
|
*Note POSIX Safety Concepts::.
|
|
The function ‘strxfrm’ transforms the multibyte string FROM using
|
the collation transformation determined by the locale currently
|
selected for collation, and stores the transformed string in the
|
array TO. Up to SIZE bytes (including a terminating null byte) are
|
stored.
|
|
The behavior is undefined if the strings TO and FROM overlap; see
|
*note Copying Strings and Arrays::.
|
|
The return value is the length of the entire transformed string.
|
This value is not affected by the value of SIZE, but if it is
|
greater than or equal to SIZE, it means that the transformed string
|
did not entirely fit in the array TO. In this case, only as much
|
of the string as actually fits was stored. To get the whole
|
transformed string, call ‘strxfrm’ again with a bigger output
|
array.
|
|
The transformed string may be longer than the original string, and
|
it may also be shorter.
|
|
If SIZE is zero, no bytes are stored in TO. In this case,
|
‘strxfrm’ simply returns the number of bytes that would be the
|
length of the transformed string. This is useful for determining
|
what size the allocated array should be. It does not matter what
|
TO is if SIZE is zero; TO may even be a null pointer.
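The zero-size call can be used to allocate a buffer of exactly the right size.  This is a sketch under the assumption that plain ‘malloc’ is acceptable; the helper name ‘make_collation_key’ is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated transformed copy of STRING, or a null
   pointer if memory cannot be allocated.  The caller is
   responsible for freeing the result.  */
char *
make_collation_key (const char *string)
{
  /* With a size of zero nothing is stored, so a null pointer is
     acceptable as the destination.  */
  size_t length = strxfrm (NULL, string, 0);
  char *key = malloc (length + 1);      /* +1 for the null byte.  */

  if (key != NULL)
    strxfrm (key, string, length + 1);
  return key;
}
```

Two keys produced this way can be compared with ‘strcmp’, and the result has the same sign as ‘strcoll’ applied to the original strings.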
|
|
-- Function: size_t wcsxfrm (wchar_t *restrict WTO, const wchar_t
|
*WFROM, size_t SIZE)
|
Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem |
|
*Note POSIX Safety Concepts::.
|
|
The function ‘wcsxfrm’ transforms wide string WFROM using the
|
collation transformation determined by the locale currently
|
selected for collation, and stores the transformed string in the
|
array WTO. Up to SIZE wide characters (including a terminating
|
null wide character) are stored.
|
|
The behavior is undefined if the strings WTO and WFROM overlap; see
|
*note Copying Strings and Arrays::.
|
|
The return value is the length of the entire transformed wide
|
string. This value is not affected by the value of SIZE, but if it
|
is greater than or equal to SIZE, it means that the transformed wide
|
string did not entirely fit in the array WTO. In this case, only
|
as much of the wide string as actually fits was stored. To get the
|
whole transformed wide string, call ‘wcsxfrm’ again with a bigger
|
output array.
|
|
The transformed wide string may be longer than the original wide
|
string, and it may also be shorter.
|
|
If SIZE is zero, no wide characters are stored in WTO. In this
|
case, ‘wcsxfrm’ simply returns the number of wide characters that
|
would be the length of the transformed wide string. This is useful
|
for determining what size the allocated array should be (remember
|
to multiply by ‘sizeof (wchar_t)’). It does not matter what WTO
|
is if SIZE is zero; WTO may even be a null pointer.
|
|
Here is an example of how you can use ‘strxfrm’ when you plan to do
|
many comparisons. It does the same thing as the previous example, but
|
much faster, because it has to transform each string only once, no
|
matter how many times it is compared with other strings. Even the time
|
needed to allocate and free storage is much less than the time we save,
|
when there are many strings.
|
|
struct sorter { char *input; char *transformed; };
|
|
/* This is the comparison function used with ‘qsort’
|
to sort an array of ‘struct sorter’. */
|
|
int
|
compare_elements (const void *v1, const void *v2)
|
{
|
const struct sorter *p1 = v1;
|
const struct sorter *p2 = v2;
|
|
return strcmp (p1->transformed, p2->transformed);
|
}
|
|
/* This is the entry point—the function to sort
|
strings using the locale’s collating sequence. */
|
|
void
|
sort_strings_fast (char **array, int nstrings)
|
{
|
struct sorter temp_array[nstrings];
|
int i;
|
|
/* Set up ‘temp_array’. Each element contains
|
one input string and its transformed string. */
|
for (i = 0; i < nstrings; i++)
|
{
|
size_t length = strlen (array[i]) * 2;
|
char *transformed;
|
size_t transformed_length;
|
|
temp_array[i].input = array[i];
|
|
/* First try a buffer perhaps big enough. */
|
transformed = (char *) xmalloc (length);
|
|
/* Transform ‘array[i]’. */
|
transformed_length = strxfrm (transformed, array[i], length);
|
|
/* If the buffer was not large enough, resize it
|
and try again. */
|
if (transformed_length >= length)
|
{
|
/* Allocate the needed space. +1 for terminating
|
‘'\0'’ byte. */
|
transformed = (char *) xrealloc (transformed,
|
transformed_length + 1);
|
|
/* The return value is not interesting because we know
|
how long the transformed string is. */
|
(void) strxfrm (transformed, array[i],
|
transformed_length + 1);
|
}
|
|
temp_array[i].transformed = transformed;
|
}
|
|
/* Sort ‘temp_array’ by comparing transformed strings. */
|
qsort (temp_array, nstrings,
|
sizeof (struct sorter), compare_elements);
|
|
/* Put the elements back in the permanent array
|
in their sorted order. */
|
for (i = 0; i < nstrings; i++)
|
array[i] = temp_array[i].input;
|
|
/* Free the strings we allocated. */
|
for (i = 0; i < nstrings; i++)
|
free (temp_array[i].transformed);
|
}
|
|
The interesting part of this code for the wide character version
|
would look like this:
|
|
void
|
sort_strings_fast (wchar_t **array, int nstrings)
|
{
|
…
|
/* Transform ‘array[i]’. */
|
transformed_length = wcsxfrm (transformed, array[i], length);
|
|
/* If the buffer was not large enough, resize it
|
and try again. */
|
if (transformed_length >= length)
|
{
|
/* Allocate the needed space. +1 for terminating
|
‘L'\0'’ wide character. */
|
transformed = (wchar_t *) xrealloc (transformed,
|
(transformed_length + 1)
|
* sizeof (wchar_t));
|
|
/* The return value is not interesting because we know
|
how long the transformed string is. */
|
(void) wcsxfrm (transformed, array[i],
|
transformed_length + 1);
|
}
|
…
|
|
Note the additional multiplication by ‘sizeof (wchar_t)’ in the
|
‘xrealloc’ call.
|
|
*Compatibility Note:* The string collation functions are a new
|
feature of ISO C90. Older C dialects have no equivalent feature. The
|
wide character versions were introduced in Amendment 1 to ISO C90.
|
|
|
File: libc.info, Node: Search Functions, Next: Finding Tokens in a String, Prev: Collation Functions, Up: String and Array Utilities
|
|
5.9 Search Functions
|
====================
|
|
This section describes library functions which perform various kinds of
|
searching operations on strings and arrays. These functions are
|
declared in the header file ‘string.h’.
|
|
-- Function: void * memchr (const void *BLOCK, int C, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function finds the first occurrence of the byte C (converted
|
to an ‘unsigned char’) in the initial SIZE bytes of the object
|
beginning at BLOCK. The return value is a pointer to the located
|
byte, or a null pointer if no match was found.
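Because ‘memchr’ takes an explicit size, it can search data that contains embedded null bytes.  The following sketch converts the returned pointer into an offset; the helper name ‘byte_offset’ is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first byte equal to C within the SIZE
   bytes starting at BLOCK, or SIZE if there is no such byte.  */
size_t
byte_offset (const void *block, int c, size_t size)
{
  const unsigned char *start = block;
  const unsigned char *found = memchr (block, c, size);

  return found != NULL ? (size_t) (found - start) : size;
}
```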
|
|
-- Function: wchar_t * wmemchr (const wchar_t *BLOCK, wchar_t WC,
|
size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function finds the first occurrence of the wide character WC
|
in the initial SIZE wide characters of the object beginning at
|
BLOCK. The return value is a pointer to the located wide
|
character, or a null pointer if no match was found.
|
|
-- Function: void * rawmemchr (const void *BLOCK, int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Often the ‘memchr’ function is used with the knowledge that the
|
byte C is available in the memory block specified by the
|
parameters. But this means that the SIZE parameter is not really
|
needed and that the tests performed with it at runtime (to check
|
whether the end of the block is reached) are not needed.
|
|
The ‘rawmemchr’ function exists for just this situation which is
|
surprisingly frequent. The interface is similar to ‘memchr’ except
|
that the SIZE parameter is missing. The function will look beyond
|
the end of the block pointed to by BLOCK in case the programmer
|
made an error in assuming that the byte C is present in the block.
|
In this case the result is unspecified. Otherwise the return value
|
is a pointer to the located byte.
|
|
This function is of special interest when looking for the end of a
|
string. Since all strings are terminated by a null byte a call
|
like
|
|
rawmemchr (str, '\0')
|
|
will never go beyond the end of the string.
|
|
This function is a GNU extension.
|
|
-- Function: void * memrchr (const void *BLOCK, int C, size_t SIZE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The function ‘memrchr’ is like ‘memchr’, except that it searches
|
backwards from the end of the block defined by BLOCK and SIZE
|
(instead of forwards from the front).
|
|
This function is a GNU extension.
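As an illustration, ‘memrchr’ can locate the last separator in a buffer that is described by a pointer and a length and need not be null-terminated.  This sketch assumes that ‘_GNU_SOURCE’ is defined before any header is included; the helper name ‘directory_length’ is hypothetical.

```c
#define _GNU_SOURCE 1           /* memrchr is a GNU extension.  */
#include <stddef.h>
#include <string.h>

/* Return the number of bytes that precede the last '/' in the
   LENGTH bytes starting at PATH, or 0 if there is no '/'.  */
size_t
directory_length (const char *path, size_t length)
{
  const char *slash = memrchr (path, '/', length);

  return slash != NULL ? (size_t) (slash - path) : 0;
}
```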
|
|
-- Function: char * strchr (const char *STRING, int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strchr’ function finds the first occurrence of the byte C
|
(converted to a ‘char’) in the string beginning at STRING. The
|
return value is a pointer to the located byte, or a null pointer if
|
no match was found.
|
|
For example,
|
strchr ("hello, world", 'l')
|
⇒ "llo, world"
|
strchr ("hello, world", '?')
|
⇒ NULL
|
|
The terminating null byte is considered to be part of the string,
|
so you can use this function to get a pointer to the end of a string
|
by specifying zero as the value of the C argument.
|
|
When ‘strchr’ returns a null pointer, it does not let you know the
|
position of the terminating null byte it has found. If you need
|
that information, it is better (but less portable) to use
|
‘strchrnul’ than to search for it a second time.
|
|
-- Function: wchar_t * wcschr (const wchar_t *WSTRING, wchar_t WC)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcschr’ function finds the first occurrence of the wide
|
character WC in the wide string beginning at WSTRING. The return
|
value is a pointer to the located wide character, or a null pointer
|
if no match was found.
|
|
The terminating null wide character is considered to be part of the
|
wide string, so you can use this function to get a pointer to the end
|
of a wide string by specifying a null wide character as the value
|
of the WC argument. It would be better (but less portable) to use
|
‘wcschrnul’ in this case, though.
|
|
-- Function: char * strchrnul (const char *STRING, int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘strchrnul’ is the same as ‘strchr’ except that if it does not find
|
the byte, it returns a pointer to the string’s terminating null byte
|
rather than a null pointer.
|
|
This function is a GNU extension.
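Since the result is never a null pointer, it can be used in pointer arithmetic without a check.  The sketch below assumes that ‘_GNU_SOURCE’ is defined before any header is included; the helper name ‘first_field_length’ is hypothetical.

```c
#define _GNU_SOURCE 1           /* strchrnul is a GNU extension.  */
#include <stddef.h>
#include <string.h>

/* Return the length of the part of RECORD that precedes the first
   DELIMITER byte, or the length of the whole string if RECORD
   contains no DELIMITER.  */
size_t
first_field_length (const char *record, char delimiter)
{
  return (size_t) (strchrnul (record, delimiter) - record);
}
```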
|
|
-- Function: wchar_t * wcschrnul (const wchar_t *WSTRING, wchar_t WC)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘wcschrnul’ is the same as ‘wcschr’ except that if it does not find
|
the wide character, it returns a pointer to the wide string’s
|
terminating null wide character rather than a null pointer.
|
|
This function is a GNU extension.
|
|
One useful, but unusual, use of the ‘strchr’ function is when one
|
wants to have a pointer pointing to the null byte terminating a string.
|
This is often written in this way:
|
|
s += strlen (s);
|
|
This is almost optimal but the addition operation duplicates a bit of
|
the work already done in the ‘strlen’ function. A better solution is
|
this:
|
|
s = strchr (s, '\0');
|
|
There is no restriction on the second parameter of ‘strchr’ so it
|
could very well also be zero. Those readers thinking very hard about
|
this might now point out that the ‘strchr’ function is more expensive
|
than the ‘strlen’ function since we have two abort criteria. This is
|
right. But in the GNU C Library the implementation of ‘strchr’ is
|
optimized in a special way so that ‘strchr’ actually is faster.
|
|
-- Function: char * strrchr (const char *STRING, int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The function ‘strrchr’ is like ‘strchr’, except that it searches
|
backwards from the end of the string STRING (instead of forwards
|
from the front).
|
|
For example,
|
strrchr ("hello, world", 'l')
|
⇒ "ld"
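A frequent use of ‘strrchr’ is extracting the last component of a file name.  This is a simplified sketch that treats ‘/’ as the only separator; the helper name ‘last_component’ is hypothetical.

```c
#include <string.h>

/* Return a pointer to the part of PATH that follows the last '/',
   or PATH itself if it contains no '/'.  The result points into
   PATH; nothing is copied.  */
const char *
last_component (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash != NULL ? slash + 1 : path;
}
```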
|
|
-- Function: wchar_t * wcsrchr (const wchar_t *WSTRING, wchar_t C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The function ‘wcsrchr’ is like ‘wcschr’, except that it searches
|
backwards from the end of the string WSTRING (instead of forwards
|
from the front).
|
|
-- Function: char * strstr (const char *HAYSTACK, const char *NEEDLE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is like ‘strchr’, except that it searches HAYSTACK for a
|
substring NEEDLE rather than just a single byte. It returns a
|
pointer into the string HAYSTACK that is the first byte of the
|
substring, or a null pointer if no match was found. If NEEDLE is
|
an empty string, the function returns HAYSTACK.
|
|
For example,
|
strstr ("hello, world", "l")
|
⇒ "llo, world"
|
strstr ("hello, world", "wo")
|
⇒ "world"
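Repeated calls, each starting just past the previous match, find every occurrence.  The following sketch counts non-overlapping matches; the helper name ‘count_occurrences’ is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return the number of non-overlapping occurrences of NEEDLE in
   HAYSTACK.  */
size_t
count_occurrences (const char *haystack, const char *needle)
{
  size_t needle_length = strlen (needle);
  size_t count = 0;

  /* An empty needle matches at every position; report zero
     instead of looping forever.  */
  if (needle_length == 0)
    return 0;

  for (const char *p = strstr (haystack, needle);
       p != NULL;
       p = strstr (p + needle_length, needle))
    count++;

  return count;
}
```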
|
|
-- Function: wchar_t * wcsstr (const wchar_t *HAYSTACK, const wchar_t
|
*NEEDLE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is like ‘wcschr’, except that it searches HAYSTACK for a
|
substring NEEDLE rather than just a single wide character. It
|
returns a pointer into the string HAYSTACK that is the first wide
|
character of the substring, or a null pointer if no match was
|
found. If NEEDLE is an empty string, the function returns
|
HAYSTACK.
|
|
-- Function: wchar_t * wcswcs (const wchar_t *HAYSTACK, const wchar_t
|
*NEEDLE)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘wcswcs’ is a deprecated alias for ‘wcsstr’. This is the name
|
originally used in the X/Open Portability Guide before the Amendment 1
|
to ISO C90 was published.
|
|
-- Function: char * strcasestr (const char *HAYSTACK, const char
|
*NEEDLE)
|
Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX
|
Safety Concepts::.
|
|
This is like ‘strstr’, except that it ignores case in searching for
|
the substring. Like ‘strcasecmp’, it is locale dependent how
|
uppercase and lowercase characters are related, and arguments are
|
multibyte strings.
|
|
For example,
|
strcasestr ("hello, world", "L")
|
⇒ "llo, world"
|
strcasestr ("hello, World", "wo")
|
⇒ "World"
|
|
-- Function: void * memmem (const void *HAYSTACK, size_t HAYSTACK-LEN,
|
const void *NEEDLE, size_t NEEDLE-LEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is like ‘strstr’, but NEEDLE and HAYSTACK are byte arrays
|
rather than strings. NEEDLE-LEN is the length of NEEDLE and
|
HAYSTACK-LEN is the length of HAYSTACK.
|
|
This function is a GNU extension.
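Unlike ‘strstr’, ‘memmem’ can search for patterns that contain null bytes.  This sketch assumes that ‘_GNU_SOURCE’ is defined before any header is included; the helper name ‘contains_bytes’ is hypothetical.

```c
#define _GNU_SOURCE 1           /* memmem is a GNU extension.  */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the NEEDLE_LEN bytes at NEEDLE occur somewhere
   within the HAYSTACK_LEN bytes at HAYSTACK.  */
bool
contains_bytes (const void *haystack, size_t haystack_len,
                const void *needle, size_t needle_len)
{
  return memmem (haystack, haystack_len, needle, needle_len) != NULL;
}
```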
|
|
-- Function: size_t strspn (const char *STRING, const char *SKIPSET)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strspn’ (“string span”) function returns the length of the
|
initial substring of STRING that consists entirely of bytes that
|
are members of the set specified by the string SKIPSET. The order
|
of the bytes in SKIPSET is not important.
|
|
For example,
|
strspn ("hello, world", "abcdefghijklmnopqrstuvwxyz")
|
⇒ 5
|
|
In a multibyte string, characters consisting of more than one byte
|
are not treated as single entities. Each byte is treated
|
separately. The function is not locale-dependent.
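For instance, ‘strspn’ can skip over leading blanks.  This is a sketch; the helper name ‘skip_leading_blanks’ is hypothetical, and only spaces and tabs are treated as blanks.

```c
#include <string.h>

/* Return a pointer to the first byte of STRING that is neither a
   space nor a tab.  If STRING consists only of such bytes, the
   result points at its terminating null byte.  */
const char *
skip_leading_blanks (const char *string)
{
  return string + strspn (string, " \t");
}
```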
|
|
-- Function: size_t wcsspn (const wchar_t *WSTRING, const wchar_t
|
*SKIPSET)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcsspn’ (“wide character string span”) function returns the
|
length of the initial substring of WSTRING that consists entirely
|
of wide characters that are members of the set specified by the
|
string SKIPSET. The order of the wide characters in SKIPSET is not
|
important.
|
|
-- Function: size_t strcspn (const char *STRING, const char *STOPSET)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strcspn’ (“string complement span”) function returns the
|
length of the initial substring of STRING that consists entirely of
|
bytes that are _not_ members of the set specified by the string
|
STOPSET. (In other words, it returns the offset of the first byte
|
in STRING that is a member of the set STOPSET.)
|
|
For example,
|
strcspn ("hello, world", " \t\n,.;!?")
|
⇒ 5
|
|
In a multibyte string, characters consisting of more than one byte
|
are not treated as single entities. Each byte is treated
|
separately. The function is not locale-dependent.
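Because ‘strcspn’ returns the length of the whole string when no byte from STOPSET is present, its result is always a valid index into the string.  The sketch below uses this to cut a line at its first line terminator; the helper name ‘chomp’ is hypothetical.

```c
#include <string.h>

/* Truncate LINE at the first carriage return or newline.  If LINE
   contains neither, the index is that of the terminating null
   byte and the string is left unchanged.  */
void
chomp (char *line)
{
  line[strcspn (line, "\r\n")] = '\0';
}
```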
|
|
-- Function: size_t wcscspn (const wchar_t *WSTRING, const wchar_t
|
*STOPSET)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcscspn’ (“wide character string complement span”) function
|
returns the length of the initial substring of WSTRING that
|
consists entirely of wide characters that are _not_ members of the
|
set specified by the string STOPSET. (In other words, it returns
|
the offset of the first wide character in WSTRING that is a member
|
of the set STOPSET.)
|
|
-- Function: char * strpbrk (const char *STRING, const char *STOPSET)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘strpbrk’ (“string pointer break”) function is similar to
|
‘strcspn’, except that it returns a pointer to the first byte in
|
STRING that is a member of the set STOPSET instead of the length of
|
the initial substring. It returns a null pointer if no such byte
|
from STOPSET is found.
|
|
For example,
|
|
strpbrk ("hello, world", " \t\n,.;!?")
|
⇒ ", world"
|
|
In a multibyte string, characters consisting of more than one byte
|
are not treated as single entities. Each byte is treated
|
separately. The function is not locale-dependent.
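Calling ‘strpbrk’ repeatedly, each time starting one byte past the previous match, visits every byte of the string that belongs to the set.  The following sketch counts them; the helper name ‘count_members’ is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return the number of bytes in STRING that are members of
   STOPSET.  */
size_t
count_members (const char *string, const char *stopset)
{
  size_t count = 0;

  /* A match is never the terminating null byte, so P + 1 is
     always still inside the string.  */
  for (const char *p = strpbrk (string, stopset);
       p != NULL;
       p = strpbrk (p + 1, stopset))
    count++;

  return count;
}
```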
|
|
-- Function: wchar_t * wcspbrk (const wchar_t *WSTRING, const wchar_t
|
*STOPSET)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcspbrk’ (“wide character string pointer break”) function is
|
similar to ‘wcscspn’, except that it returns a pointer to the first
|
wide character in WSTRING that is a member of the set STOPSET
|
instead of the length of the initial substring. It returns a null
|
pointer if no such wide character from STOPSET is found.
|
|
5.9.1 Compatibility String Search Functions
|
-------------------------------------------
|
|
-- Function: char * index (const char *STRING, int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘index’ is another name for ‘strchr’; they are exactly the same.
|
New code should always use ‘strchr’ since this name is defined in ISO C
|
while ‘index’ is a BSD invention which never was available on System V
|
derived systems.
|
|
-- Function: char * rindex (const char *STRING, int C)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘rindex’ is another name for ‘strrchr’; they are exactly the same.
|
New code should always use ‘strrchr’ since this name is defined in ISO C
|
while ‘rindex’ is a BSD invention which never was available on System V
|
derived systems.
|
|
|
File: libc.info, Node: Finding Tokens in a String, Next: Erasing Sensitive Data, Prev: Search Functions, Up: String and Array Utilities
|
|
5.10 Finding Tokens in a String
|
===============================
|
|
It’s fairly common for programs to have a need to do some simple kinds
|
of lexical analysis and parsing, such as splitting a command string up
|
into tokens. You can do this with the ‘strtok’ function, declared in
|
the header file ‘string.h’.
|
|
-- Function: char * strtok (char *restrict NEWSTRING, const char
|
*restrict DELIMITERS)
|
Preliminary: | MT-Unsafe race:strtok | AS-Unsafe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
A string can be split into tokens by making a series of calls to
|
the function ‘strtok’.
|
|
The string to be split up is passed as the NEWSTRING argument on
|
the first call only. The ‘strtok’ function uses this to set up
|
some internal state information. Subsequent calls to get
|
additional tokens from the same string are indicated by passing a
|
null pointer as the NEWSTRING argument. Calling ‘strtok’ with
|
another non-null NEWSTRING argument reinitializes the state
|
information. It is guaranteed that no other library function ever
|
calls ‘strtok’ behind your back (which would mess up this internal
|
state information).
|
|
The DELIMITERS argument is a string that specifies a set of
|
delimiters that may surround the token being extracted. All the
|
initial bytes that are members of this set are discarded. The
|
first byte that is _not_ a member of this set of delimiters marks
|
the beginning of the next token. The end of the token is found by
|
looking for the next byte that is a member of the delimiter set.
|
This byte in the original string NEWSTRING is overwritten by a null
|
byte, and the pointer to the beginning of the token in NEWSTRING is
|
returned.
|
|
On the next call to ‘strtok’, the searching begins at the next byte
|
beyond the one that marked the end of the previous token. Note
|
that the set of delimiters DELIMITERS does not have to be the same on
|
every call in a series of calls to ‘strtok’.
|
|
If the end of the string NEWSTRING is reached, or if the remainder
|
of the string consists only of delimiter bytes, ‘strtok’ returns a null
|
pointer.
|
|
In a multibyte string, characters consisting of more than one byte
|
are not treated as single entities. Each byte is treated
|
separately. The function is not locale-dependent.
|
|
-- Function: wchar_t * wcstok (wchar_t *NEWSTRING, const wchar_t
|
*DELIMITERS, wchar_t **SAVE_PTR)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
A string can be split into tokens by making a series of calls to
|
the function ‘wcstok’.
|
|
The string to be split up is passed as the NEWSTRING argument on
|
the first call only. The ‘wcstok’ function uses this to set up
|
some internal state information. Subsequent calls to get
|
additional tokens from the same wide string are indicated by
|
passing a null pointer as the NEWSTRING argument, which causes the
|
pointer previously stored in SAVE_PTR to be used instead.
|
|
The DELIMITERS argument is a wide string that specifies a set of
|
delimiters that may surround the token being extracted. All the
|
initial wide characters that are members of this set are discarded.
|
The first wide character that is _not_ a member of this set of
|
delimiters marks the beginning of the next token. The end of the
|
token is found by looking for the next wide character that is a
|
member of the delimiter set. This wide character in the original
|
wide string NEWSTRING is overwritten by a null wide character, the
|
pointer past the overwritten wide character is saved in SAVE_PTR,
|
and the pointer to the beginning of the token in NEWSTRING is
|
returned.
|
|
On the next call to ‘wcstok’, the searching begins at the next wide
|
character beyond the one that marked the end of the previous token.
|
Note that the set of delimiters DELIMITERS does not have to be the
|
same on every call in a series of calls to ‘wcstok’.
|
|
If the end of the wide string NEWSTRING is reached, or if the
|
remainder of the string consists only of delimiter wide characters,
|
‘wcstok’ returns a null pointer.
|
|
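The calling sequence described above is usually written as a loop.  The
following is a minimal sketch, not part of the library:
‘count_wide_tokens’ is a hypothetical helper, and it assumes the caller
passes a writable wide string.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: count the tokens in the writable wide string
   TEXT.  TEXT is modified in place, as described above.  */
size_t
count_wide_tokens (wchar_t *text, const wchar_t *delimiters)
{
  wchar_t *save_ptr;   /* Parsing state kept by the caller.  */
  size_t count = 0;

  /* The first call passes the string; later calls pass a null pointer
     and rely on the pointer stored in SAVE_PTR.  */
  for (wchar_t *token = wcstok (text, delimiters, &save_ptr);
       token != NULL;
       token = wcstok (NULL, delimiters, &save_ptr))
    count++;

  return count;
}
```

Because the state lives in a variable owned by the caller, two such
loops can run at the same time without disturbing each other.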
*Warning:* Since ‘strtok’ and ‘wcstok’ alter the string they are
|
parsing, you should always copy the string to a temporary buffer before
|
parsing it with ‘strtok’/‘wcstok’ (*note Copying Strings and Arrays::).
|
If you allow ‘strtok’ or ‘wcstok’ to modify a string that came from
|
another part of your program, you are asking for trouble; that string
|
might be used for other purposes after ‘strtok’ or ‘wcstok’ has modified
|
it, and it would not have the expected value.
|
|
The string that you are operating on might even be a constant. Then
|
when ‘strtok’ or ‘wcstok’ tries to modify it, your program will get a
|
fatal signal for writing in read-only memory. *Note Program Error
|
Signals::. Even if the operation of ‘strtok’ or ‘wcstok’ would not
|
require a modification of the string (e.g., if there is exactly one
|
token) the string can (and in the GNU C Library case will) be modified.
|
|
This is a special case of a general principle: if a part of a program
|
does not have as its purpose the modification of a certain data
|
structure, then it is error-prone to modify the data structure
|
temporarily.
|
|
The function ‘strtok’ is not reentrant, whereas ‘wcstok’ is. *Note
|
Nonreentrancy::, for a discussion of where and why reentrancy is
|
important.
|
|
Here is a simple example showing the use of ‘strtok’.
|
|
#include <string.h>
|
#include <stddef.h>
|
|
…
|
|
const char string[] = "words separated by spaces -- and, punctuation!";
|
const char delimiters[] = " .,;:!-";
|
char *token, *cp;
|
|
…
|
|
cp = strdupa (string); /* Make writable copy. */
|
token = strtok (cp, delimiters); /* token => "words" */
|
token = strtok (NULL, delimiters); /* token => "separated" */
|
token = strtok (NULL, delimiters); /* token => "by" */
|
token = strtok (NULL, delimiters); /* token => "spaces" */
|
token = strtok (NULL, delimiters); /* token => "and" */
|
token = strtok (NULL, delimiters); /* token => "punctuation" */
|
token = strtok (NULL, delimiters); /* token => NULL */
|
|
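In practice the repeated calls are usually written as a loop.  The
following is a minimal sketch under stated assumptions:
‘count_tokens’ is a hypothetical helper, and it copies its input with
‘malloc’ rather than ‘strdupa’ so that it does not depend on GNU
extensions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the tokens in TEXT without modifying it.
   Returns (size_t) -1 if the working copy cannot be allocated.  */
size_t
count_tokens (const char *text, const char *delimiters)
{
  size_t len = strlen (text) + 1;
  char *copy = malloc (len);
  size_t count = 0;

  if (copy == NULL)
    return (size_t) -1;
  memcpy (copy, text, len);   /* strtok writes into the copy only.  */

  /* The first call passes the string; later calls pass a null pointer.  */
  for (char *token = strtok (copy, delimiters);
       token != NULL;
       token = strtok (NULL, delimiters))
    count++;

  free (copy);
  return count;
}
```

Like ‘strtok’ itself, this helper is not safe to call from several
threads at once, because the parsing state is shared.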
The GNU C Library contains two more functions for tokenizing a string
|
which overcome the limitation of non-reentrancy. They are not available
|
for wide strings.
|
|
-- Function: char * strtok_r (char *NEWSTRING, const char *DELIMITERS,
|
char **SAVE_PTR)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Just like ‘strtok’, this function splits the string into several
|
tokens which can be accessed by successive calls to ‘strtok_r’.
|
The difference is that, as in ‘wcstok’, the information about the
|
next token is stored in the space pointed to by the third argument,
|
SAVE_PTR, which is a pointer to a string pointer. Calling
|
‘strtok_r’ with a null pointer for NEWSTRING and leaving SAVE_PTR
|
between the calls unchanged does the job without hindering
|
reentrancy.
|
|
This function is defined in POSIX.1 and can be found on many
|
systems which support multi-threading.
|
|
-- Function: char * strsep (char **STRING_PTR, const char *DELIMITER)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This function has functionality similar to ‘strtok_r’, with the
|
NEWSTRING argument replaced by the SAVE_PTR argument. The
|
initialization of the moving pointer has to be done by the user.
|
Successive calls to ‘strsep’ move the pointer along the tokens
|
separated by DELIMITER, returning the address of the next token and
|
updating STRING_PTR to point to the beginning of the next token.
|
|
One difference between ‘strsep’ and ‘strtok_r’ is that if the input
|
string contains more than one byte from DELIMITER in a row ‘strsep’
|
returns an empty string for each pair of bytes from DELIMITER.
|
This means that a program normally should test for ‘strsep’
|
returning an empty string before processing it.
|
|
This function was introduced in 4.3BSD and therefore is widely
|
available.
|
|
Here is what the above example looks like when ‘strsep’ is used.
|
|
#include <string.h>
|
#include <stddef.h>
|
|
…
|
|
const char string[] = "words separated by spaces -- and, punctuation!";
|
const char delimiters[] = " .,;:!-";
|
char *running;
|
char *token;
|
|
…
|
|
running = strdupa (string);
|
token = strsep (&running, delimiters); /* token => "words" */
|
token = strsep (&running, delimiters); /* token => "separated" */
|
token = strsep (&running, delimiters); /* token => "by" */
|
token = strsep (&running, delimiters); /* token => "spaces" */
|
token = strsep (&running, delimiters); /* token => "" */
|
token = strsep (&running, delimiters); /* token => "" */
|
token = strsep (&running, delimiters); /* token => "" */
|
token = strsep (&running, delimiters); /* token => "and" */
|
token = strsep (&running, delimiters); /* token => "" */
|
token = strsep (&running, delimiters); /* token => "punctuation" */
|
token = strsep (&running, delimiters); /* token => "" */
|
token = strsep (&running, delimiters); /* token => NULL */
|
|
-- Function: char * basename (const char *FILENAME)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The GNU version of the ‘basename’ function returns the last
|
component of the path in FILENAME. This function is the preferred
|
usage, since it does not modify the argument, FILENAME, and
|
respects trailing slashes. The prototype for ‘basename’ can be
|
found in ‘string.h’. Note, this function is overridden by the XPG
|
version, if ‘libgen.h’ is included.
|
|
Example of using GNU ‘basename’:
|
|
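#define _GNU_SOURCE /* Select the GNU version of basename. */
|
#include <stdio.h>
|
#include <stdlib.h>
|
#include <string.h>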
|
|
int
|
main (int argc, char *argv[])
|
{
|
char *prog = basename (argv[0]);
|
|
if (argc < 2)
|
{
|
fprintf (stderr, "Usage %s <arg>\n", prog);
|
exit (1);
|
}
|
|
…
|
}
|
|
*Portability Note:* This function may produce different results on
|
different systems.
|
|
-- Function: char * basename (char *PATH)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
This is the standard XPG defined ‘basename’. It is similar in
|
spirit to the GNU version, but may modify the PATH by removing
|
trailing ’/’ bytes. If the PATH is made up entirely of ’/’ bytes,
|
then "/" will be returned. Also, if PATH is ‘NULL’ or an empty
|
string, then "." is returned. The prototype for the XPG version
|
can be found in ‘libgen.h’.
|
|
Example of using XPG ‘basename’:
|
|
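#define _GNU_SOURCE /* For strdupa. */
|
#include <libgen.h>
|
#include <stdio.h>
|
#include <stdlib.h>
|
#include <string.h>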
|
|
int
|
main (int argc, char *argv[])
|
{
|
char *prog;
|
char *path = strdupa (argv[0]);
|
|
prog = basename (path);
|
|
if (argc < 2)
|
{
|
fprintf (stderr, "Usage %s <arg>\n", prog);
|
exit (1);
|
}
|
|
…
|
|
}
|
|
-- Function: char * dirname (char *PATH)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘dirname’ function is the complement to the XPG version of
|
‘basename’. It returns the parent directory of the file specified
|
by PATH. If PATH is ‘NULL’, an empty string, or contains no ’/’
|
bytes, then "." is returned. The prototype for this function can
|
be found in ‘libgen.h’.
|
|
|
File: libc.info, Node: Erasing Sensitive Data, Next: strfry, Prev: Finding Tokens in a String, Up: String and Array Utilities
|
|
5.11 Erasing Sensitive Data
|
===========================
|
|
Sensitive data, such as cryptographic keys, should be erased from memory
|
after use, to reduce the risk that a bug will expose it to the outside
|
world. However, compiler optimizations may determine that an erasure
|
operation is “unnecessary,” and remove it from the generated code,
|
because no _correct_ program could access the variable or heap object
|
containing the sensitive data after it’s deallocated. Since erasure is
|
a precaution against bugs, this optimization is inappropriate.
|
|
The function ‘explicit_bzero’ erases a block of memory, and
|
guarantees that the compiler will not remove the erasure as
|
“unnecessary.”
|
|
#include <string.h>
|
|
extern void encrypt (const char *key, const char *in,
|
char *out, size_t n);
|
extern void genkey (const char *phrase, char *key);
|
|
void encrypt_with_phrase (const char *phrase, const char *in,
|
char *out, size_t n)
|
{
|
char key[16];
|
genkey (phrase, key);
|
encrypt (key, in, out, n);
|
explicit_bzero (key, 16);
|
}
|
|
In this example, if ‘memset’, ‘bzero’, or a hand-written loop had been
|
used, the compiler might remove them as “unnecessary.”
|
|
*Warning:* ‘explicit_bzero’ does not guarantee that sensitive data is
|
_completely_ erased from the computer’s memory. There may be copies in
|
temporary storage areas, such as registers and “scratch” stack space;
|
since these are invisible to the source code, a library function cannot
|
erase them.
|
|
Also, ‘explicit_bzero’ only operates on RAM. If a sensitive data
|
object never needs to have its address taken other than to call
|
‘explicit_bzero’, it might be stored entirely in CPU registers _until_
|
the call to ‘explicit_bzero’. Then it will be copied into RAM, the copy
|
will be erased, and the original will remain intact. Data in RAM is
|
more likely to be exposed by a bug than data in registers, so this
|
creates a brief window where the data is at greater risk of exposure
|
than it would have been if the program didn’t try to erase it at all.
|
|
Declaring sensitive variables as ‘volatile’ will make both the above
|
problems _worse_; a ‘volatile’ variable will be stored in memory for its
|
entire lifetime, and the compiler will make _more_ copies of it than it
|
would otherwise have. Attempting to erase a normal variable “by hand”
|
through a ‘volatile’-qualified pointer doesn’t work at all—because the
|
variable itself is not ‘volatile’, some compilers will ignore the
|
qualification on the pointer and remove the erasure anyway.
|
|
Having said all that, in most situations, using ‘explicit_bzero’ is
|
better than not using it. At present, the only way to do a more
|
thorough job is to write the entire sensitive operation in assembly
|
language. We anticipate that future compilers will recognize calls to
|
‘explicit_bzero’ and take appropriate steps to erase all the copies of
|
the affected data, wherever they may be.
|
|
-- Function: void explicit_bzero (void *BLOCK, size_t LEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘explicit_bzero’ writes zero into LEN bytes of memory beginning at
|
BLOCK, just as ‘bzero’ would. The zeroes are always written, even
|
if the compiler could determine that this is “unnecessary” because
|
no correct program could read them back.
|
|
*Note:* The _only_ optimization that ‘explicit_bzero’ disables is
|
removal of “unnecessary” writes to memory. The compiler can
|
perform all the other optimizations that it could for a call to
|
‘memset’. For instance, it may replace the function call with
|
inline memory writes, and it may assume that BLOCK cannot be a null
|
pointer.
|
|
*Portability Note:* This function first appeared in OpenBSD 5.5 and
|
has not been standardized. Other systems may provide the same
|
functionality under a different name, such as ‘explicit_memset’,
|
‘memset_s’, or ‘SecureZeroMemory’.
|
|
The GNU C Library declares this function in ‘string.h’, but on
|
other systems it may be in ‘strings.h’ instead.
|
|
|
File: libc.info, Node: strfry, Next: Trivial Encryption, Prev: Erasing Sensitive Data, Up: String and Array Utilities
|
|
5.12 strfry
|
===========
|
|
The function below addresses the perennial programming quandary: “How do
|
I take good data in string form and painlessly turn it into garbage?”
|
This is actually a fairly simple task for C programmers who do not use
|
the GNU C Library string functions, but for programs based on the GNU C
|
Library, the ‘strfry’ function is the preferred method for destroying
|
string data.
|
|
The prototype for this function is in ‘string.h’.
|
|
-- Function: char * strfry (char *STRING)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘strfry’ creates a pseudorandom anagram of a string, replacing the
|
input with the anagram in place. For each position in the string,
|
‘strfry’ swaps it with a position in the string selected at random
|
(from a uniform distribution). The two positions may be the same.
|
|
The return value of ‘strfry’ is always STRING.
|
|
*Portability Note:* This function is unique to the GNU C Library.
|
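Because the result is an anagram, the shuffled string always contains
the same bytes as the original.  The following sketch checks that
property by sorting both strings; ‘strfry_keeps_bytes’ and
‘compare_bytes’ are hypothetical helpers written for this illustration.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* strfry is a GNU extension.  */
#endif
#include <stdlib.h>
#include <string.h>

/* Order two bytes for qsort.  */
static int
compare_bytes (const void *a, const void *b)
{
  return *(const unsigned char *) a - *(const unsigned char *) b;
}

/* Hypothetical check: shuffle a copy of TEXT with strfry and report
   whether the result holds exactly the same bytes as TEXT.  Returns 1
   if so, 0 if not, and -1 if memory runs out.  */
int
strfry_keeps_bytes (const char *text)
{
  char *original = strdup (text);
  char *shuffled = strdup (text);
  int same = -1;

  if (original != NULL && shuffled != NULL)
    {
      size_t len = strlen (text);
      strfry (shuffled);   /* Shuffles in place; returns its argument.  */
      qsort (original, len, 1, compare_bytes);
      qsort (shuffled, len, 1, compare_bytes);
      same = memcmp (original, shuffled, len) == 0;
    }

  free (original);
  free (shuffled);
  return same;
}
```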
|
|
File: libc.info, Node: Trivial Encryption, Next: Encode Binary Data, Prev: strfry, Up: String and Array Utilities
|
|
5.13 Trivial Encryption
|
=======================
|
|
The ‘memfrob’ function converts an array of data to something
|
unrecognizable and back again. It is not encryption in its usual sense
|
since it is easy for someone to convert the encrypted data back to clear
|
text. The transformation is analogous to Usenet’s “Rot13” encryption
|
method for obscuring offensive jokes from sensitive eyes and such.
|
Unlike Rot13, ‘memfrob’ works on arbitrary binary data, not just text.
|
|
For true encryption, *Note Cryptographic Functions::.
|
|
This function is declared in ‘string.h’.
|
|
-- Function: void * memfrob (void *MEM, size_t LENGTH)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
‘memfrob’ transforms (frobnicates) each byte of the data structure
|
at MEM, which is LENGTH bytes long, by bitwise exclusive oring it
|
with binary 00101010. It does the transformation in place and its
|
return value is always MEM.
|
|
Note that calling ‘memfrob’ a second time on the same data structure
|
returns it to its original state.
|
|
This is a good function for hiding information from someone who
|
doesn’t want to see it or doesn’t want to see it very much. To
|
really prevent people from retrieving the information, use stronger
|
encryption such as that described in *Note Cryptographic
|
Functions::.
|
|
*Portability Note:* This function is unique to the GNU C Library.
|
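The round trip described above can be sketched as follows.  The helper
‘memfrob_round_trip’ and its 64-byte limit are assumptions made to keep
the example short.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* memfrob is a GNU extension.  */
#endif
#include <string.h>

/* Hypothetical helper: frobnicate the LENGTH bytes at DATA twice and
   return 1 if the original contents come back, 0 otherwise.  The first
   byte of the intermediate form is stored in *FIRST_FROBBED.  LENGTH
   must be at least 1 and at most 64 in this sketch.  */
int
memfrob_round_trip (unsigned char *data, size_t length,
                    unsigned char *first_frobbed)
{
  unsigned char saved[64];

  if (length == 0 || length > sizeof saved)
    return 0;
  memcpy (saved, data, length);

  memfrob (data, length);      /* Each byte is XORed with 42.  */
  *first_frobbed = data[0];
  memfrob (data, length);      /* XORing again undoes the change.  */

  return memcmp (saved, data, length) == 0;
}
```

Binary 00101010 is decimal 42, so a byte B becomes ‘B ^ 42’ after the
first call and B again after the second.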
|
|
File: libc.info, Node: Encode Binary Data, Next: Argz and Envz Vectors, Prev: Trivial Encryption, Up: String and Array Utilities
|
|
5.14 Encode Binary Data
|
=======================
|
|
To store or transfer binary data in environments which only support text
|
one has to encode the binary data by mapping the input bytes to bytes in
|
the range allowed for storing or transferring. SVID systems (and
|
nowadays XPG compliant systems) provide minimal support for this task.
|
|
-- Function: char * l64a (long int N)
|
Preliminary: | MT-Unsafe race:l64a | AS-Unsafe | AC-Safe | *Note
|
POSIX Safety Concepts::.
|
|
This function encodes a 32-bit input value using bytes from the
|
basic character set. It returns a pointer to a 7 byte buffer which
|
contains an encoded version of N. To encode a series of bytes the
|
user must copy the returned string to a destination buffer. It
|
returns the empty string if N is zero, which is somewhat bizarre
|
but mandated by the standard.
|
*Warning:* Since a static buffer is used this function should not
|
be used in multi-threaded programs. There is no thread-safe
|
alternative to this function in the C library.
|
*Compatibility Note:* The XPG standard states that the return value
|
of ‘l64a’ is undefined if N is negative. In the GNU
|
implementation, ‘l64a’ treats its argument as unsigned, so it will
|
return a sensible encoding for any nonzero N; however, portable
|
programs should not rely on this.
|
|
To encode a large buffer ‘l64a’ must be called in a loop, once for
|
each 32-bit word of the buffer. For example, one could do
|
something like this:
|
|
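#define _GNU_SOURCE /* For mempcpy. */
|
#include <arpa/inet.h> /* htonl */
|
#include <stdlib.h> /* malloc, l64a */
|
#include <string.h> /* stpcpy, mempcpy */
|
|
char *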
|
encode (const void *buf, size_t len)
|
{
|
/* We know in advance how long the buffer has to be. */
|
unsigned char *in = (unsigned char *) buf;
|
char *out = malloc (6 + ((len + 3) / 4) * 6 + 1);
|
char *cp = out, *p;
|
|
/* Encode the length. */
|
/* Using ‘htonl’ is necessary so that the data can be
|
decoded even on machines with different byte order.
|
‘l64a’ can return a string shorter than 6 bytes, so
|
we pad it with encoding of 0 ('.') at the end by
|
hand. */
|
|
p = stpcpy (cp, l64a (htonl (len)));
|
cp = mempcpy (p, "......", 6 - (p - cp));
|
|
while (len > 3)
|
{
|
unsigned long int n = *in++;
|
n = (n << 8) | *in++;
|
n = (n << 8) | *in++;
|
n = (n << 8) | *in++;
|
len -= 4;
|
p = stpcpy (cp, l64a (htonl (n)));
|
cp = mempcpy (p, "......", 6 - (p - cp));
|
}
|
if (len > 0)
|
{
|
unsigned long int n = *in++;
|
if (--len > 0)
|
{
|
n = (n << 8) | *in++;
|
if (--len > 0)
|
n = (n << 8) | *in;
|
}
|
cp = stpcpy (cp, l64a (htonl (n)));
|
}
|
*cp = '\0';
|
return out;
|
}
|
|
It is strange that the library does not provide the complete
|
functionality needed but so be it.
|
|
To decode data produced with ‘l64a’ the following function should be
|
used.
|
|
-- Function: long int a64l (const char *STRING)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The parameter STRING should contain a string which was produced by
|
a call to ‘l64a’. The function processes at most 6 bytes of this
|
string, and decodes the bytes it finds according to the table
|
below. It stops decoding when it finds a byte not in the table,
|
rather like ‘atoi’; if you have a buffer which has been broken into
|
lines, you must be careful to skip over the end-of-line bytes.
|
|
The decoded number is returned as a ‘long int’ value.
|
|
The ‘l64a’ and ‘a64l’ functions use a base 64 encoding, in which each
|
byte of an encoded string represents six bits of an input word. These
|
symbols are used for the base 64 digits:
|
|
0 1 2 3 4 5 6 7
|
0 ‘.’ ‘/’ ‘0’ ‘1’ ‘2’ ‘3’ ‘4’ ‘5’
|
8 ‘6’ ‘7’ ‘8’ ‘9’ ‘A’ ‘B’ ‘C’ ‘D’
|
16 ‘E’ ‘F’ ‘G’ ‘H’ ‘I’ ‘J’ ‘K’ ‘L’
|
24 ‘M’ ‘N’ ‘O’ ‘P’ ‘Q’ ‘R’ ‘S’ ‘T’
|
32 ‘U’ ‘V’ ‘W’ ‘X’ ‘Y’ ‘Z’ ‘a’ ‘b’
|
40 ‘c’ ‘d’ ‘e’ ‘f’ ‘g’ ‘h’ ‘i’ ‘j’
|
48 ‘k’ ‘l’ ‘m’ ‘n’ ‘o’ ‘p’ ‘q’ ‘r’
|
56 ‘s’ ‘t’ ‘u’ ‘v’ ‘w’ ‘x’ ‘y’ ‘z’
|
|
This encoding scheme is not standard. There are some other encoding
|
methods which are much more widely used (UU encoding, MIME encoding).
|
Generally, it is better to use one of these encodings.
|
|
|
File: libc.info, Node: Argz and Envz Vectors, Prev: Encode Binary Data, Up: String and Array Utilities
|
|
5.15 Argz and Envz Vectors
|
==========================
|
|
"argz vectors" are vectors of strings in a contiguous block of memory,
|
each element separated from its neighbors by null bytes (‘'\0'’).
|
|
"Envz vectors" are an extension of argz vectors where each element is
|
a name-value pair, separated by a ‘'='’ byte (as in a Unix environment).
|
|
* Menu:
|
|
* Argz Functions:: Operations on argz vectors.
|
* Envz Functions:: Additional operations on environment vectors.
|
|
|
File: libc.info, Node: Argz Functions, Next: Envz Functions, Up: Argz and Envz Vectors
|
|
5.15.1 Argz Functions
|
---------------------
|
|
Each argz vector is represented by a pointer to the first element, of
|
type ‘char *’, and a size, of type ‘size_t’, both of which can be
|
initialized to ‘0’ to represent an empty argz vector. All argz
|
functions accept either a pointer and a size argument, or pointers to
|
them, if they will be modified.
|
|
The argz functions use ‘malloc’/‘realloc’ to allocate/grow argz
|
vectors, and so any argz vector created using these functions may be
|
freed by using ‘free’; conversely, any argz function that may grow a
|
string expects that string to have been allocated using ‘malloc’ (those
|
argz functions that only examine their arguments or modify them in place
|
will work on any sort of memory). *Note Unconstrained Allocation::.
|
|
All argz functions that do memory allocation have a return type of
|
‘error_t’, and return ‘0’ for success, and ‘ENOMEM’ if an allocation
|
error occurs.
|
|
These functions are declared in the standard include file ‘argz.h’.
|
|
-- Function: error_t argz_create (char *const ARGV[], char **ARGZ,
|
size_t *ARGZ_LEN)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘argz_create’ function converts the Unix-style argument vector
|
ARGV (a vector of pointers to normal C strings, terminated by
|
‘(char *)0’; *note Program Arguments::) into an argz vector with
|
the same elements, which is returned in ARGZ and ARGZ_LEN.
|
|
-- Function: error_t argz_create_sep (const char *STRING, int SEP, char
|
**ARGZ, size_t *ARGZ_LEN)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘argz_create_sep’ function converts the string STRING into an
|
argz vector (returned in ARGZ and ARGZ_LEN) by splitting it into
|
elements at every occurrence of the byte SEP.
|
|
-- Function: size_t argz_count (const char *ARGZ, size_t ARGZ_LEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
Returns the number of elements in the argz vector ARGZ and
|
ARGZ_LEN.
|
|
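As an illustration, ‘argz_create_sep’ and ‘argz_count’ can be combined
to count the components of a search path.  This is a minimal sketch;
‘count_path_components’ is a hypothetical helper.

```c
#include <argz.h>
#include <stdlib.h>

/* Hypothetical helper: return the number of components in the
   colon-separated search path PATH, or (size_t) -1 if memory runs
   out.  */
size_t
count_path_components (const char *path)
{
  char *argz = NULL;
  size_t argz_len = 0;
  size_t count;

  if (argz_create_sep (path, ':', &argz, &argz_len) != 0)
    return (size_t) -1;

  count = argz_count (argz, argz_len);
  free (argz);   /* Argz vectors are allocated with malloc.  */
  return count;
}
```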
-- Function: void argz_extract (const char *ARGZ, size_t ARGZ_LEN, char
|
**ARGV)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘argz_extract’ function converts the argz vector ARGZ and
|
ARGZ_LEN into a Unix-style argument vector stored in ARGV, by
|
putting pointers to every element in ARGZ into successive positions
|
in ARGV, followed by a terminator of ‘0’. ARGV must be
|
pre-allocated with enough space to hold all the elements in ARGZ
|
plus the terminating ‘(char *)0’ (‘(argz_count (ARGZ, ARGZ_LEN) +
|
1) * sizeof (char *)’ bytes should be enough). Note that the
|
string pointers stored into ARGV point into ARGZ—they are not
|
copies—and so ARGZ must be copied if it will be changed while ARGV
|
is still active. This function is useful for passing the elements
|
in ARGZ to an exec function (*note Executing a File::).
|
|
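The size calculation given above can be wrapped up as in the following
sketch.  ‘argz_to_argv’ is a hypothetical helper, not a library
function.

```c
#include <argz.h>
#include <stdlib.h>

/* Hypothetical helper: build a null-terminated pointer vector for the
   argz vector ARGZ.  The pointers refer to ARGZ itself, so ARGZ must
   outlive the result.  Returns NULL if memory runs out; the caller
   frees the result with free.  */
char **
argz_to_argv (char *argz, size_t argz_len)
{
  size_t count = argz_count (argz, argz_len);
  char **argv = malloc ((count + 1) * sizeof (char *));

  if (argv != NULL)
    argz_extract (argz, argz_len, argv);
  return argv;
}
```

Only the pointer vector is newly allocated here; freeing it leaves ARGZ
untouched.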
-- Function: void argz_stringify (char *ARGZ, size_t LEN, int SEP)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘argz_stringify’ function converts ARGZ into a normal string with the
|
elements separated by the byte SEP, by replacing each ‘'\0'’ inside
|
ARGZ (except the last one, which terminates the string) with SEP.
|
This is handy for printing ARGZ in a readable manner.
|
|
-- Function: error_t argz_add (char **ARGZ, size_t *ARGZ_LEN, const
|
char *STR)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘argz_add’ function adds the string STR to the end of the argz
|
vector ‘*ARGZ’, and updates ‘*ARGZ’ and ‘*ARGZ_LEN’ accordingly.
|
|
-- Function: error_t argz_add_sep (char **ARGZ, size_t *ARGZ_LEN, const
|
char *STR, int DELIM)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘argz_add_sep’ function is similar to ‘argz_add’, but STR is
|
split into separate elements in the result at occurrences of the
|
byte DELIM. This is useful, for instance, for adding the
|
components of a Unix search path to an argz vector, by using a
|
value of ‘':'’ for DELIM.
|
|
-- Function: error_t argz_append (char **ARGZ, size_t *ARGZ_LEN, const
|
char *BUF, size_t BUF_LEN)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘argz_append’ function appends BUF_LEN bytes starting at BUF to
|
the argz vector ‘*ARGZ’, reallocating ‘*ARGZ’ to accommodate it,
|
and adding BUF_LEN to ‘*ARGZ_LEN’.
|
|
-- Function: void argz_delete (char **ARGZ, size_t *ARGZ_LEN, char
|
*ENTRY)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
If ENTRY points to the beginning of one of the elements in the argz
|
vector ‘*ARGZ’, the ‘argz_delete’ function will remove this entry
|
and reallocate ‘*ARGZ’, modifying ‘*ARGZ’ and ‘*ARGZ_LEN’
|
accordingly. Note that as destructive argz functions usually
|
reallocate their argz argument, pointers into argz vectors such as
|
ENTRY will then become invalid.
|
|
-- Function: error_t argz_insert (char **ARGZ, size_t *ARGZ_LEN, char
|
*BEFORE, const char *ENTRY)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘argz_insert’ function inserts the string ENTRY into the argz
|
vector ‘*ARGZ’ at a point just before the existing element pointed
|
to by BEFORE, reallocating ‘*ARGZ’ and updating ‘*ARGZ’ and
|
‘*ARGZ_LEN’. If BEFORE is ‘0’, ENTRY is added to the end instead
|
(as if by ‘argz_add’). Since the first element is in fact the same
|
as ‘*ARGZ’, passing in ‘*ARGZ’ as the value of BEFORE will result
|
in ENTRY being inserted at the beginning.
|
|
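The last point can be sketched as a small wrapper that adds an element
at the front of a vector.  The name ‘argz_prepend’ is hypothetical; the
library does not provide such a function.

```c
#include <argz.h>
#include <stddef.h>

/* Hypothetical helper: add ENTRY as the first element of the argz
   vector *ARGZ.  Returns 0 on success or an error code such as
   ENOMEM.  */
error_t
argz_prepend (char **argz, size_t *argz_len, const char *entry)
{
  /* *ARGZ points at the first element, so inserting before it puts
     ENTRY at the front.  For an empty vector *ARGZ is a null pointer
     and the call behaves like argz_add.  */
  return argz_insert (argz, argz_len, *argz, entry);
}
```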
-- Function: char * argz_next (const char *ARGZ, size_t ARGZ_LEN, const
|
char *ENTRY)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘argz_next’ function provides a convenient way of iterating
|
over the elements in the argz vector ARGZ. It returns a pointer to
|
the next element in ARGZ after the element ENTRY, or ‘0’ if there
|
are no elements following ENTRY. If ENTRY is ‘0’, the first
|
element of ARGZ is returned.
|
|
This behavior suggests two styles of iteration:
|
|
char *entry = 0;
|
while ((entry = argz_next (ARGZ, ARGZ_LEN, entry)))
|
ACTION;
|
|
(the double parentheses are necessary to make some C compilers shut
|
up about what they consider a questionable ‘while’-test) and:
|
|
char *entry;
|
for (entry = ARGZ;
|
entry;
|
entry = argz_next (ARGZ, ARGZ_LEN, entry))
|
ACTION;
|
|
Note that the latter depends on ARGZ having a value of ‘0’ if it is
|
empty (rather than a pointer to an empty block of memory); this
|
invariant is maintained for argz vectors created by the functions
|
here.
|
|
-- Function: error_t argz_replace (char **ARGZ, size_t *ARGZ_LEN,
|
const char *STR, const char *WITH, unsigned *REPLACE_COUNT)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
Replace any occurrences of the string STR in ARGZ with WITH,
|
reallocating ARGZ as necessary. If REPLACE_COUNT is non-zero,
|
‘*REPLACE_COUNT’ will be incremented by the number of replacements
|
performed.
|
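Since ‘*REPLACE_COUNT’ is incremented rather than assigned, the caller
should initialize it.  The following sketch shows one possible use;
‘sources_to_objects’ and the ".c" and ".o" suffixes are assumptions made
for the example.

```c
#include <argz.h>
#include <stddef.h>

/* Hypothetical helper: replace each ".c" in the elements of *ARGZ with
   ".o" and store the reported replacement count in *REPLACED.  Returns
   0 on success or an error code such as ENOMEM.  */
error_t
sources_to_objects (char **argz, size_t *argz_len, unsigned *replaced)
{
  /* The count is incremented, not assigned, so start it at zero.  */
  *replaced = 0;
  return argz_replace (argz, argz_len, ".c", ".o", replaced);
}
```

Elements that do not contain STR, such as a file named README, are left
as they are.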
|
|
File: libc.info, Node: Envz Functions, Prev: Argz Functions, Up: Argz and Envz Vectors
|
|
5.15.2 Envz Functions
|
---------------------
|
|
Envz vectors are just argz vectors with additional constraints on the
|
form of each element; as such, argz functions can also be used on them,
|
where it makes sense.
|
|
Each element in an envz vector is a name-value pair, separated by a
|
‘'='’ byte; if multiple ‘'='’ bytes are present in an element, those
|
after the first are considered part of the value, and treated like all
|
other non-‘'\0'’ bytes.
|
|
If _no_ ‘'='’ bytes are present in an element, that element is
|
considered the name of a “null” entry, as distinct from an entry with an
|
empty value: ‘envz_get’ will return ‘0’ if given the name of a null entry,
|
whereas an entry with an empty value would result in a value of ‘""’;
|
‘envz_entry’ will still find such entries, however. Null entries can be
|
removed with the ‘envz_strip’ function.
|
|
As with argz functions, envz functions that may allocate memory (and
|
thus fail) have a return type of ‘error_t’, and return either ‘0’ or
|
‘ENOMEM’.
|
|
These functions are declared in the standard include file ‘envz.h’.
|
|
-- Function: char * envz_entry (const char *ENVZ, size_t ENVZ_LEN,
|
const char *NAME)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘envz_entry’ function finds the entry in ENVZ with the name
|
NAME, and returns a pointer to the whole entry—that is, the argz
|
element which begins with NAME followed by a ‘'='’ byte. If there
|
is no entry with that name, ‘0’ is returned.
|
|
-- Function: char * envz_get (const char *ENVZ, size_t ENVZ_LEN, const
|
char *NAME)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘envz_get’ function finds the entry in ENVZ with the name NAME
|
(like ‘envz_entry’), and returns a pointer to the value portion of
|
that entry (following the ‘'='’). If there is no entry with that
|
name (or only a null entry), ‘0’ is returned.
|
|
-- Function: error_t envz_add (char **ENVZ, size_t *ENVZ_LEN, const
|
char *NAME, const char *VALUE)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘envz_add’ function adds an entry to ‘*ENVZ’ (updating ‘*ENVZ’
|
and ‘*ENVZ_LEN’) with the name NAME, and value VALUE. If an entry
|
with the same name already exists in ENVZ, it is removed first. If
|
VALUE is ‘0’, then the new entry will be the special null type of
|
entry (mentioned above).
|
|
-- Function: error_t envz_merge (char **ENVZ, size_t *ENVZ_LEN, const
|
char *ENVZ2, size_t ENVZ2_LEN, int OVERRIDE)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘envz_merge’ function adds each entry in ENVZ2 to ENVZ, as if
|
with ‘envz_add’, updating ‘*ENVZ’ and ‘*ENVZ_LEN’. If OVERRIDE is
|
true, then values in ENVZ2 will supersede those with the same name
|
in ENVZ, otherwise not.
|
|
Null entries are treated just like other entries in this respect,
|
so a null entry in ENVZ can prevent an entry of the same name in
|
ENVZ2 from being added to ENVZ, if OVERRIDE is false.
|
|
-- Function: void envz_strip (char **ENVZ, size_t *ENVZ_LEN)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘envz_strip’ function removes any null entries from ENVZ,
|
updating ‘*ENVZ’ and ‘*ENVZ_LEN’.
|
|
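The following sketch shows how null entries behave before and after
‘envz_strip’.  ‘strip_null_entries’ and the entry names in it are
hypothetical and serve only as an illustration.

```c
#include <envz.h>
#include <stdlib.h>

/* Hypothetical demonstration: build an envz vector holding one ordinary
   entry and one null entry, strip it, and return the number of entries
   left, or (size_t) -1 on failure.  *NULL_VISIBLE reports whether
   envz_entry found the null entry before stripping.  */
size_t
strip_null_entries (int *null_visible)
{
  char *envz = NULL;
  size_t envz_len = 0;
  size_t count = (size_t) -1;

  *null_visible = 0;
  if (envz_add (&envz, &envz_len, "TERM", "xterm") == 0
      && envz_add (&envz, &envz_len, "DEBUG", NULL) == 0)
    {
      /* envz_entry sees the null entry even though envz_get does not.  */
      *null_visible = (envz_entry (envz, envz_len, "DEBUG") != NULL
                       && envz_get (envz, envz_len, "DEBUG") == NULL);
      envz_strip (&envz, &envz_len);
      count = argz_count (envz, envz_len);
    }

  free (envz);
  return count;
}
```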
-- Function: void envz_remove (char **ENVZ, size_t *ENVZ_LEN, const
|
char *NAME)
|
Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note
|
POSIX Safety Concepts::.
|
|
The ‘envz_remove’ function removes an entry named NAME from ENVZ,
|
updating ‘*ENVZ’ and ‘*ENVZ_LEN’.
|
|
|
File: libc.info, Node: Character Set Handling, Next: Locales, Prev: String and Array Utilities, Up: Top
|
|
6 Character Set Handling
|
************************
|
|
Character sets used in the early days of computing had only six, seven,
|
or eight bits for each character: there was never a case where more than
|
eight bits (one byte) were used to represent a single character. The
|
limitations of this approach became more apparent as more people
|
grappled with non-Roman character sets, where not all the characters
|
that make up a language’s character set can be represented by 2^8
|
choices. This chapter shows the functionality that was added to the C
|
library to support multiple character sets.
|
|
* Menu:
|
|
* Extended Char Intro:: Introduction to Extended Characters.
|
* Charset Function Overview:: Overview about Character Handling
|
Functions.
|
* Restartable multibyte conversion:: Restartable multibyte conversion
|
Functions.
|
* Non-reentrant Conversion:: Non-reentrant Conversion Function.
|
* Generic Charset Conversion:: Generic Charset Conversion.
|
|
|
File: libc.info, Node: Extended Char Intro, Next: Charset Function Overview, Up: Character Set Handling
|
|
6.1 Introduction to Extended Characters
|
=======================================
|
|
A variety of solutions are available to overcome the differences between
|
character sets with a 1:1 relation between bytes and characters and
|
character sets with ratios of 2:1 or 4:1. The remainder of this section
|
gives a few examples to help understand the design decisions made while
|
developing the functionality of the C library.
|
|
A distinction we have to make right away is between internal and
|
external representation. "Internal representation" means the
|
representation used by a program while keeping the text in memory.
|
External representations are used when text is stored or transmitted
|
through some communication channel. Examples of external
|
representations include files waiting in a directory to be read and
|
parsed.
|
|
Traditionally there has been no difference between the two
|
representations. It was equally comfortable and useful to use the same
|
single-byte representation internally and externally. This comfort
|
level decreases with more and larger character sets.
|
|
One of the problems to overcome with the internal representation is
|
handling text that is externally encoded using different character sets.
|
Assume a program that reads two texts and compares them using some
|
metric. The comparison can be usefully done only if the texts are
|
internally kept in a common format.
|
|
For such a common format (= character set) eight bits are certainly
|
no longer enough. So the smallest entity will have to grow: "wide
|
characters" will now be used. Instead of one byte per character, two or
|
four bytes will be used. (Three bytes are awkward to address in memory,
|
and more than four bytes do not seem to be necessary.)
|
|
As shown in other parts of this manual, a completely new family of
|
functions has been created that can handle wide character texts in
|
memory. The most commonly used character sets for such internal wide
|
character representations are Unicode and ISO 10646 (also known as UCS
|
for Universal Character Set). Unicode was originally planned as a
|
16-bit character set, whereas ISO 10646 was designed to be a 31-bit
|
large code space. The two standards are practically identical. They
|
have the same character repertoire and code table, but Unicode specifies
|
added semantics. Initially, only characters in the first ‘0x10000’
|
code positions (the so-called Basic Multilingual Plane, BMP) were
|
assigned, but more specialized characters outside this 16-bit space
|
have since been assigned as well. A number of encodings have been
|
defined for Unicode and ISO 10646 characters: UCS-2 is a 16-bit word
|
that can only represent characters from the BMP, UCS-4 is a 32-bit word
|
that can represent any Unicode and ISO 10646 character, UTF-8 is an
|
ASCII compatible encoding where ASCII characters are represented by
|
ASCII bytes and non-ASCII characters by sequences of 2-6 non-ASCII
|
bytes, and finally UTF-16 is an extension of UCS-2 in which pairs of
|
certain UCS-2 words can be used to encode non-BMP characters up to
|
‘0x10ffff’.
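To make the UTF-8 scheme concrete, here is a small sketch of an encoder. It is an illustration only, not code from the library: the name ‘utf8_encode’ is hypothetical, and the sketch assumes the range that UTF-16 can also reach (up to ‘0x10ffff’, at most four bytes) rather than the full 31-bit space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode code point CP as UTF-8 into OUT, which must have room for
   4 bytes.  Returns the number of bytes written, or 0 if CP is a
   UTF-16 surrogate value or lies beyond 0x10ffff.  */
size_t
utf8_encode (uint32_t cp, unsigned char *out)
{
  assert (out != NULL);
  if (cp < 0x80)
    {
      /* ASCII characters are represented by themselves.  */
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xc0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3f));
      return 2;
    }
  if (cp < 0x10000)
    {
      if (cp >= 0xd800 && cp <= 0xdfff)
        return 0;               /* reserved for UTF-16 surrogates */
      out[0] = (unsigned char) (0xe0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[2] = (unsigned char) (0x80 | (cp & 0x3f));
      return 3;
    }
  if (cp <= 0x10ffff)
    {
      out[0] = (unsigned char) (0xf0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3f));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[3] = (unsigned char) (0x80 | (cp & 0x3f));
      return 4;
    }
  return 0;
}
```

Every byte of a multi-byte sequence has its high bit set, so no byte of such a sequence can be mistaken for an ASCII character.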
|
|
To represent wide characters the ‘char’ type is not suitable. For
|
this reason the ISO C standard introduces a new type that is designed to
|
keep one character of a wide character string. To maintain the
|
similarity there is also a type corresponding to ‘int’ for those
|
functions that take a single wide character.
|
|
-- Data type: wchar_t
|
This data type is used as the base type for wide character strings.
|
In other words, arrays of objects of this type are the equivalent
|
of ‘char[]’ for multibyte character strings. The type is defined
|
in ‘stddef.h’.
|
|
The ISO C90 standard, where ‘wchar_t’ was introduced, does not say
|
anything specific about the representation. It only requires that
|
this type is capable of storing all elements of the basic character
|
set. Therefore it would be legitimate to define ‘wchar_t’ as
|
‘char’, which might make sense for embedded systems.
|
|
But in the GNU C Library ‘wchar_t’ is always 32 bits wide and is
|
therefore capable of representing all UCS-4 values, covering all of
|
ISO 10646. Some Unix systems define ‘wchar_t’ as a
|
16-bit type and thereby follow Unicode very strictly. This
|
definition is perfectly fine with the standard, but it also means
|
that to represent all characters from Unicode and ISO 10646 one has
|
to use UTF-16 surrogate characters, which is in fact a
|
multi-wide-character encoding. But resorting to
|
multi-wide-character encoding contradicts the purpose of the
|
‘wchar_t’ type.
|
|
-- Data type: wint_t
|
‘wint_t’ is a data type used for parameters and variables that
|
contain a single wide character. As the name suggests this type is
|
the equivalent of ‘int’ when using the normal ‘char’ strings. The
|
types ‘wchar_t’ and ‘wint_t’ often have the same representation if
|
their size is 32 bits wide but if ‘wchar_t’ is defined as ‘char’
|
the type ‘wint_t’ must be defined as ‘int’ due to the parameter
|
promotion.
|
|
This type is defined in ‘wchar.h’ and was introduced in Amendment 1
|
to ISO C90.
|
|
As for the ‘char’ data type, macros are available for
|
specifying the minimum and maximum value representable in an object of
|
type ‘wchar_t’.
|
|
-- Macro: wint_t WCHAR_MIN
|
The macro ‘WCHAR_MIN’ evaluates to the minimum value representable
|
by an object of type ‘wchar_t’.
|
|
This macro was introduced in Amendment 1 to ISO C90.
|
|
-- Macro: wint_t WCHAR_MAX
|
The macro ‘WCHAR_MAX’ evaluates to the maximum value representable
|
by an object of type ‘wchar_t’.
|
|
This macro was introduced in Amendment 1 to ISO C90.
|
|
Another special wide character value is the equivalent to ‘EOF’.
|
|
-- Macro: wint_t WEOF
|
The macro ‘WEOF’ evaluates to a constant expression of type
|
‘wint_t’ whose value is different from any member of the extended
|
character set.
|
|
‘WEOF’ need not be the same value as ‘EOF’ and unlike ‘EOF’ it also
|
need _not_ be negative. In other words, sloppy code like
|
|
{
|
int c;
|
…
|
while ((c = getc (fp)) >= 0)
|
…
|
}
|
|
has to be rewritten to use ‘WEOF’ explicitly when wide characters
|
are used:
|
|
{
|
wint_t c;
|
…
|
while ((c = getwc (fp)) != WEOF)
|
…
|
}
|
|
This macro was introduced in Amendment 1 to ISO C90 and is defined
|
in ‘wchar.h’.
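A complete, minimal sketch of such a loop might look as follows. It uses ‘getwc’, the wide character counterpart of ‘getc’. The function names and the temporary-file driver are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Count the wide characters that can be read from FP.  */
size_t
count_wide_chars (FILE *fp)
{
  wint_t c;                     /* wint_t so that WEOF can be stored */
  size_t count = 0;

  assert (fp != NULL);
  while ((c = getwc (fp)) != WEOF)
    ++count;
  return count;
}

/* Hypothetical driver: write TEXT to a temporary file and count the
   wide characters read back.  Returns (size_t) -1 if no temporary
   file could be created.  */
size_t
count_wide_chars_in (const wchar_t *text)
{
  FILE *fp = tmpfile ();
  size_t count;

  if (fp == NULL)
    return (size_t) -1;
  fputws (text, fp);
  rewind (fp);
  count = count_wide_chars (fp);
  fclose (fp);
  return count;
}
```

The comparison is made against ‘WEOF’ itself and never against zero or a negative number.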
|
|
These internal representations present problems when it comes to
|
storage and transmittal. Because each single wide character consists of
|
more than one byte, they are affected by byte-ordering. Thus, machines
|
with different endianness would see different values when accessing the
|
same data. This byte ordering concern also applies for communication
|
protocols that are all byte-based and therefore require that the sender
|
has to decide about splitting the wide character in bytes. A last (but
|
not least important) point is that wide characters often require more
|
storage space than a customized byte-oriented character set.
|
|
For all the above reasons, an external encoding that is different
|
from the internal encoding is often used if the latter is UCS-2 or
|
UCS-4. The external encoding is byte-based and can be chosen
|
appropriately for the environment and for the texts to be handled. A
|
variety of different character sets can be used for this external
|
encoding (information that will not be exhaustively presented
|
here–instead, a description of the major groups will suffice). All of
|
the ASCII-based character sets fulfill one requirement: they are
|
"filesystem safe." This means that the character ‘'/'’ is used in the
|
encoding _only_ to represent itself. Things are a bit different for
|
character sets like EBCDIC (Extended Binary Coded Decimal Interchange
|
Code, a character set family used by IBM), but if the operating system
|
does not understand EBCDIC directly, the parameters to system calls have
|
to be converted first anyhow.
|
|
• The simplest character sets are single-byte character sets. There
|
can be only up to 256 characters (for 8 bit character sets), which
|
is not sufficient to cover all languages but might be sufficient to
|
handle a specific text. Handling of 8 bit character sets is
|
simple. This is not true for other kinds presented later, and
|
therefore, the application one uses might require the use of 8 bit
|
character sets.
|
|
• The ISO 2022 standard defines a mechanism for extended character
|
sets where one character _can_ be represented by more than one
|
byte. This is achieved by associating a state with the text.
|
Characters that can be used to change the state can be embedded in
|
the text. Each byte in the text might have a different
|
interpretation in each state. The state might even influence
|
whether a given byte stands for a character on its own or whether
|
it has to be combined with some more bytes.
|
|
In most uses of ISO 2022 the defined character sets do not allow
|
state changes that cover more than the next character. This has
|
the big advantage that whenever one can identify the beginning of
|
the byte sequence of a character one can interpret a text
|
correctly. Examples of character sets using this policy are the
|
various EUC character sets (used by Sun’s operating systems,
|
EUC-JP, EUC-KR, EUC-TW, and EUC-CN) or Shift_JIS (SJIS, a Japanese
|
encoding).
|
|
But there are also character sets using a state that is valid for
|
more than one character and has to be changed by another byte
|
sequence. Examples for this are ISO-2022-JP, ISO-2022-KR, and
|
ISO-2022-CN.
|
|
• Early attempts to fix 8 bit character sets for other languages
|
using the Roman alphabet led to character sets like ISO 6937.
|
Here bytes representing characters like the acute accent do not
|
produce output themselves: one has to combine them with other
|
characters to get the desired result. For example, the byte
|
sequence ‘0xc2 0x61’ (non-spacing acute accent, followed by
|
lower-case ‘a’) produces the “small a with acute” character. To get
|
the acute accent character on its own, one has to write ‘0xc2 0x20’
|
(the non-spacing acute followed by a space).
|
|
Character sets like ISO 6937 are used in some embedded systems such
|
as teletex.
|
|
• Instead of converting the Unicode or ISO 10646 text used
|
internally, it is often also sufficient to simply use an encoding
|
different than UCS-2/UCS-4. The Unicode and ISO 10646 standards
|
even specify such an encoding: UTF-8. This encoding is able to
|
represent all of ISO 10646 31 bits in a byte string of length one
|
to six.
|
|
There were a few other attempts to encode ISO 10646 such as UTF-7,
|
but UTF-8 is today the only encoding that should be used. In fact,
|
with any luck UTF-8 will soon be the only external encoding that
|
has to be supported. It proves to be universally usable and its
|
only disadvantage is that it favors Roman languages by making the
|
byte string representation of other scripts (Cyrillic, Greek, Asian
|
scripts) longer than necessary if using a specific character set
|
for these scripts. Methods like the Unicode compression scheme can
|
alleviate these problems.
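One property that makes UTF-8 easy to handle is that the first byte of a sequence announces the length of the whole sequence. The sketch below illustrates this for the one-to-six byte scheme described above; the function name is hypothetical. Be aware that the current Unicode definition of UTF-8 restricts sequences to at most four bytes, so the lengths five and six only apply to the original 31-bit scheme.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length announced by the first byte of the UTF-8
   sequence at S (original one-to-six byte scheme), or 0 if that
   byte cannot start a sequence.  */
int
utf8_sequence_length (const unsigned char *s)
{
  unsigned char lead;

  assert (s != NULL);
  lead = s[0];
  if (lead < 0x80)
    return 1;                   /* ASCII byte */
  if ((lead & 0xe0) == 0xc0)
    return 2;
  if ((lead & 0xf0) == 0xe0)
    return 3;
  if ((lead & 0xf8) == 0xf0)
    return 4;
  if ((lead & 0xfc) == 0xf8)
    return 5;
  if ((lead & 0xfe) == 0xfc)
    return 6;
  return 0;                     /* continuation byte, 0xfe, or 0xff */
}
```

The function looks only at the first byte; a real decoder would also have to check that the following bytes are continuation bytes.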
|
|
The question remaining is: how to select the character set or
|
encoding to use. The answer: you cannot decide about it yourself; it is
|
decided by the developers of the system or the majority of the users.
|
Since the goal is interoperability one has to use whatever the other
|
people one works with use. If there are no constraints, the selection
|
is based on the requirements the expected circle of users will have. In
|
other words, if a project is expected to be used in only, say, Russia it
|
is fine to use KOI8-R or a similar character set. But if at the same
|
time people from, say, Greece are participating one should use a
|
character set that allows all people to collaborate.
|
|
The most widely useful solution seems to be: go with the most general
|
character set, namely ISO 10646. Use UTF-8 as the external encoding and
|
problems about users not being able to use their own language adequately
|
are a thing of the past.
|
|
One final comment about the choice of the wide character
|
representation is necessary at this point. We have said above that the
|
natural choice is using Unicode or ISO 10646. This is not required, but
|
at least encouraged, by the ISO C standard. The standard defines at
|
least a macro ‘__STDC_ISO_10646__’ that is only defined on systems where
|
the ‘wchar_t’ type encodes ISO 10646 characters. If this symbol is not
|
defined one should avoid making assumptions about the wide character
|
representation. If the programmer uses only the functions provided by
|
the C library to handle wide character strings there should be no
|
compatibility problems with other systems.
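A program that does need to know the numeric value of a wide character can guard that code with the macro. The following is a small sketch; the function name and its three-way return convention are assumptions made for this example.

```c
#include <assert.h>
#include <wchar.h>

/* Return 1 if WC is known to be the ISO 10646 character CODE, 0 if
   it is known not to be, and -1 if the representation of wchar_t is
   unknown and no such statement can be made.  */
int
wchar_is_code_point (wchar_t wc, unsigned long code)
{
  /* ISO 10646 was designed as a 31-bit code space.  */
  assert (code <= 0x7fffffffUL);
#ifdef __STDC_ISO_10646__
  return (unsigned long) wc == code;
#else
  (void) wc;
  return -1;
#endif
}
```

Code that only passes wide characters to the library functions does not need such a check at all.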
|
|
|
File: libc.info, Node: Charset Function Overview, Next: Restartable multibyte conversion, Prev: Extended Char Intro, Up: Character Set Handling
|
|
6.2 Overview about Character Handling Functions
|
===============================================
|
|
A Unix C library contains three different sets of functions in two
|
families to handle character set conversion. One of the function
|
families (the most commonly used) is specified in the ISO C90 standard
|
and, therefore, is portable even beyond the Unix world. Unfortunately
|
this family is the least useful one. These functions should be avoided
|
whenever possible, especially when developing libraries (as opposed to
|
applications).
|
|
The second family of functions was introduced in the early Unix
|
standards (XPG2) and is still part of the latest and greatest Unix
|
standard: Unix 98. It is also the most powerful and useful set of
|
functions. But we will start with the functions defined in Amendment 1
|
to ISO C90.
|
|
|
File: libc.info, Node: Restartable multibyte conversion, Next: Non-reentrant Conversion, Prev: Charset Function Overview, Up: Character Set Handling
|
|
6.3 Restartable Multibyte Conversion Functions
|
==============================================
|
|
The ISO C standard defines functions to convert strings from a multibyte
|
representation to wide character strings. There are a number of
|
peculiarities:
|
|
• The character set assumed for the multibyte encoding is not
|
specified as an argument to the functions. Instead the character
|
set specified by the ‘LC_CTYPE’ category of the current locale is
|
used; see *note Locale Categories::.
|
|
• The functions handling more than one character at a time require
|
NUL terminated strings as the argument (i.e., converting blocks of
|
text does not work unless one can add a NUL byte at an appropriate
|
place). The GNU C Library contains some extensions to the standard
|
that allow specifying a size, but basically they also expect
|
terminated strings.
|
|
Despite these limitations the ISO C functions can be used in many
|
contexts. In graphical user interfaces, for instance, it is not
|
uncommon to have functions that require text to be displayed in a wide
|
character string if the text is not simple ASCII. The text itself might
|
come from a file with translations and the user should decide about the
|
current locale, which determines the translation and therefore also the
|
external encoding used. In such a situation (and many others) the
|
functions described here are perfect. If more freedom while performing
|
the conversion is necessary take a look at the ‘iconv’ functions (*note
|
Generic Charset Conversion::).
|
|
* Menu:
|
|
* Selecting the Conversion:: Selecting the conversion and its properties.
|
* Keeping the state:: Representing the state of the conversion.
|
* Converting a Character:: Converting Single Characters.
|
* Converting Strings:: Converting Multibyte and Wide Character
|
Strings.
|
* Multibyte Conversion Example:: A Complete Multibyte Conversion Example.
|
|
|
File: libc.info, Node: Selecting the Conversion, Next: Keeping the state, Up: Restartable multibyte conversion
|
|
6.3.1 Selecting the conversion and its properties
|
-------------------------------------------------
|
|
We already said above that the currently selected locale for the
|
‘LC_CTYPE’ category decides the conversion that is performed by the
|
functions we are about to describe. Each locale uses its own character
|
set (given as an argument to ‘localedef’) and this is the one assumed as
|
the external multibyte encoding. The wide character set is always UCS-4
|
in the GNU C Library.
|
|
A characteristic of each multibyte character set is the maximum
|
number of bytes that can be necessary to represent one character. This
|
information is quite important when writing code that uses the
|
conversion functions (as shown in the examples below). The ISO C
|
standard defines two macros that provide this information.
|
|
-- Macro: int MB_LEN_MAX
|
‘MB_LEN_MAX’ specifies the maximum number of bytes in the multibyte
|
sequence for a single character in any of the supported locales.
|
It is a compile-time constant and is defined in ‘limits.h’.
|
|
-- Macro: int MB_CUR_MAX
|
‘MB_CUR_MAX’ expands into a positive integer expression that is the
|
maximum number of bytes in a multibyte character in the current
|
locale. The value is never greater than ‘MB_LEN_MAX’. Unlike
|
‘MB_LEN_MAX’ this macro need not be a compile-time constant, and in
|
the GNU C Library it is not.
|
|
‘MB_CUR_MAX’ is defined in ‘stdlib.h’.
|
|
Two different macros are necessary since strictly ISO C90 compilers
|
do not allow variable length array definitions, but still it is
|
desirable to avoid dynamic allocation. This incomplete piece of code
|
shows the problem:
|
|
{
|
char buf[MB_LEN_MAX];
|
ssize_t len = 0;
|
|
while (! feof (fp))
|
{
|
fread (&buf[len], 1, MB_CUR_MAX - len, fp);
|
/* … process buf */
|
len -= used;
|
}
|
}
|
|
The code in the inner loop is expected to have always enough bytes in
|
the array BUF to convert one multibyte character. The array BUF has to
|
be sized statically since many compilers do not allow a variable size.
|
The ‘fread’ call makes sure that ‘MB_CUR_MAX’ bytes are always available
|
in BUF. Note that it isn’t a problem if ‘MB_CUR_MAX’ is not a
|
compile-time constant.
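A more complete version of this loop could look like the sketch below, which merely counts the characters it reads. It is an illustration under several assumptions: the function names are hypothetical, the driver uses a temporary file, and a NUL character is assumed to occupy exactly one byte that is not preceded by a shift sequence.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Count the multibyte characters readable from FP in the current
   locale.  Returns -1 for invalid input or input that ends in the
   middle of a character.  */
long
count_mb_chars (FILE *fp)
{
  char buf[MB_LEN_MAX];         /* size fixed at compile time */
  size_t len = 0;               /* bytes currently held in BUF */
  size_t used;
  long count = 0;
  int partial = 0;              /* STATE holds an unfinished character */
  mbstate_t state;

  assert (fp != NULL);
  memset (&state, '\0', sizeof (state));
  for (;;)
    {
      /* Top the buffer up to MB_CUR_MAX bytes, a run-time value.  */
      if (len < MB_CUR_MAX)
        len += fread (buf + len, 1, MB_CUR_MAX - len, fp);
      if (len == 0)
        return partial ? -1 : count;

      used = mbrtowc (NULL, buf, len, &state);
      if (used == (size_t) -1)
        return -1;
      if (used == (size_t) -2)
        {
          /* All LEN bytes were absorbed into STATE; read more.  */
          len = 0;
          partial = 1;
          continue;
        }
      if (used == 0)
        used = 1;               /* assumption: NUL is a single byte */
      memmove (buf, buf + used, len - used);
      len -= used;
      partial = 0;
      ++count;
    }
}

/* Hypothetical driver: count the characters of TEXT via a temporary
   file.  Returns -1 if the file cannot be created.  */
long
count_mb_chars_in (const char *text)
{
  FILE *fp = tmpfile ();
  long count;

  if (fp == NULL)
    return -1;
  fputs (text, fp);
  rewind (fp);
  count = count_mb_chars (fp);
  fclose (fp);
  return count;
}
```

The array is declared with ‘MB_LEN_MAX’ elements, while the amount of data requested from ‘fread’ is governed by ‘MB_CUR_MAX’.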
|
|
|
File: libc.info, Node: Keeping the state, Next: Converting a Character, Prev: Selecting the Conversion, Up: Restartable multibyte conversion
|
|
6.3.2 Representing the state of the conversion
|
----------------------------------------------
|
|
In the introduction of this chapter it was said that certain character
|
sets use a "stateful" encoding. That is, the encoded values depend in
|
some way on the previous bytes in the text.
|
|
Since the conversion functions allow converting a text in more than
|
one step we must have a way to pass this information from one call of
|
the functions to another.
|
|
-- Data type: mbstate_t
|
A variable of type ‘mbstate_t’ can contain all the information
|
about the "shift state" needed from one call to a conversion
|
function to another.
|
|
‘mbstate_t’ is defined in ‘wchar.h’. It was introduced in Amendment 1
|
to ISO C90.
|
|
To use objects of type ‘mbstate_t’ the programmer has to define such
|
objects (normally as local variables on the stack) and pass a pointer to
|
the object to the conversion functions. This way the conversion
|
function can update the object if the current multibyte character set is
|
stateful.
|
|
There is no specific function or initializer to put the state object
|
in any specific state. The rules are that the object should always
|
represent the initial state before the first use, and this is achieved
|
by clearing the whole variable with code such as follows:
|
|
{
|
mbstate_t state;
|
memset (&state, '\0', sizeof (state));
|
/* from now on STATE can be used. */
|
…
|
}
|
|
When using the conversion functions to generate output it is often
|
necessary to test whether the current state corresponds to the initial
|
state. This is necessary, for example, to decide whether to emit escape
|
sequences to set the state to the initial state at certain sequence
|
points. Communication protocols often require this.
|
|
-- Function: int mbsinit (const mbstate_t *PS)
|
Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety
|
Concepts::.
|
|
The ‘mbsinit’ function determines whether the state object pointed
|
to by PS is in the initial state. If PS is a null pointer or the
|
object is in the initial state the return value is nonzero.
|
Otherwise it is zero.
|
|
‘mbsinit’ was introduced in Amendment 1 to ISO C90 and is declared
|
in ‘wchar.h’.
|
|
Code using ‘mbsinit’ often looks similar to this:
|
|
{
|
mbstate_t state;
|
memset (&state, '\0', sizeof (state));
|
/* Use STATE. */
|
…
|
if (! mbsinit (&state))
|
{
|
/* Emit code to return to initial state. */
|
const wchar_t empty[] = L"";
|
const wchar_t *srcp = empty;
|
wcsrtombs (outbuf, &srcp, outbuflen, &state);
|
}
|
…
|
}
|
|
The code to emit the escape sequence to get back to the initial state
|
is interesting. The ‘wcsrtombs’ function can be used to determine the
|
necessary output code (*note Converting Strings::). Please note that
|
with the GNU C Library it is not necessary to perform this extra action
|
for the conversion from multibyte text to wide character text since the
|
wide character encoding is not stateful. But there is nothing mentioned
|
in any standard that prohibits making ‘wchar_t’ use a stateful encoding.
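As a further illustration, the sketch below converts a whole wide character string and then confirms with ‘mbsinit’ that the state object is back in the initial state once the terminating null wide character has been converted. The function name and its error convention are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Convert WS into the multibyte string OUT of size OUTLEN, including
   the terminating NUL byte.  Returns the number of bytes written, not
   counting the NUL, or (size_t) -1 if a character cannot be converted
   or OUT is too small.  */
size_t
wcs_to_mbs_checked (const wchar_t *ws, char *out, size_t outlen)
{
  mbstate_t state;
  const wchar_t *src = ws;
  size_t n;

  assert (ws != NULL && out != NULL);
  memset (&state, '\0', sizeof (state));
  n = wcsrtombs (out, &src, outlen, &state);
  if (n == (size_t) -1)
    return n;
  if (src != NULL)
    /* The conversion stopped early because OUT is full.  */
    return (size_t) -1;
  /* The terminating null wide character was converted, so the
     conversion must have returned to the initial state.  */
  assert (mbsinit (&state));
  return n;
}
```

If the conversion stops before the end of the input, the state may not be initial, and a caller that wants to end the output there would have to emit the necessary shift sequence first.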
|
|
|
File: libc.info, Node: Converting a Character, Next: Converting Strings, Prev: Keeping the state, Up: Restartable multibyte conversion
|
|
6.3.3 Converting Single Characters
|
----------------------------------
|
|
The most fundamental of the conversion functions are those dealing with
|
single characters. Please note that this does not always mean single
|
bytes. But since there is very often a subset of the multibyte
|
character set that consists of single byte sequences, there are
|
functions to help with converting bytes. Frequently, ASCII is a subset
|
of the multibyte character set. In such a scenario, each ASCII
|
character stands for itself, and all other characters have at least a
|
first byte that is beyond the range 0 to 127.
|
|
-- Function: wint_t btowc (int C)
|
Preliminary: | MT-Safe | AS-Unsafe corrupt heap lock dlopen |
|
AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::.
|
|
The ‘btowc’ function (“byte to wide character”) converts a valid
|
single byte character C in the initial shift state into the wide
|
character equivalent using the conversion rules from the currently
|
selected locale of the ‘LC_CTYPE’ category.
|
|
If ‘(unsigned char) C’ is no valid single byte multibyte character
|
or if C is ‘EOF’, the function returns ‘WEOF’.
|
|
Please note the restriction of C being tested for validity only in
|
the initial shift state. No ‘mbstate_t’ object is used from which
|
the state information is taken, and the function also does not use
|
any static state.
|
|
The ‘btowc’ function was introduced in Amendment 1 to ISO C90 and
|
is declared in ‘wchar.h’.
|
|
Despite the limitation that the single byte value is always
|
interpreted in the initial state, this function is actually useful most
|
of the time. Most character sets are either entirely single-byte character
|
sets or they are extensions to ASCII. But then it is possible to write
|
code like this (not that this specific example is very useful):
|
|
wchar_t *
|
itow (unsigned long int val)
|
{
|
static wchar_t buf[30];
|
wchar_t *wcp = &buf[29];
|
*wcp = L'\0';
|
while (val != 0)
|
{
|
*--wcp = btowc ('0' + val % 10);
|
val /= 10;
|
}
|
if (wcp == &buf[29])
|
*--wcp = L'0';
|
return wcp;
|
}
|
|
Why is it necessary to use such a complicated implementation and not
|
simply cast ‘'0' + val % 10’ to a wide character? The answer is that
|
there is no guarantee that one can perform this kind of arithmetic on
|
the characters of the character set used for ‘wchar_t’ representation.
|
In other situations the bytes are not constant at compile time and so
|
the compiler cannot do the work. In situations like this, using ‘btowc’
|
is required.
|
|
There is also a function for the conversion in the other direction.
|
|
-- Function: int wctob (wint_t C)
|
Preliminary: | MT-Safe | AS-Unsafe corrupt heap lock dlopen |
|
AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::.
|
|
The ‘wctob’ function (“wide character to byte”) takes as the
|
parameter a valid wide character. If the multibyte representation
|
for this character in the initial state is exactly one byte long,
|
the return value of this function is this character. Otherwise the
|
return value is ‘EOF’.
|
|
‘wctob’ was introduced in Amendment 1 to ISO C90 and is declared in
|
‘wchar.h’.
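The two functions can be combined into a round trip, as in the following sketch. The function name is hypothetical; the code simply reports whether a byte value stands for a character on its own in the initial shift state.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <wchar.h>

/* Convert the byte C to a wide character and back.  Returns the
   resulting byte value, or EOF if C is EOF or is not a valid
   single byte character in the initial shift state.  */
int
roundtrip_byte (int c)
{
  wint_t wc;

  assert (c == EOF || (c >= 0 && c <= UCHAR_MAX));
  wc = btowc (c);
  if (wc == WEOF)
    return EOF;
  return wctob (wc);
}
```

Note the types: the wide character is held in a ‘wint_t’ so that it can be compared against ‘WEOF’, while the byte is held in an ‘int’ so that it can be ‘EOF’.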
|
|
There are more general functions to convert single characters from
|
multibyte representation to wide characters and vice versa. These
|
functions pose no limit on the length of the multibyte representation
|
and they also do not require it to be in the initial state.
|
|
-- Function: size_t mbrtowc (wchar_t *restrict PWC, const char
|
*restrict S, size_t N, mbstate_t *restrict PS)
|
Preliminary: | MT-Unsafe race:mbrtowc/!ps | AS-Unsafe corrupt heap
|
lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety
|
Concepts::.
|
|
The ‘mbrtowc’ function (“multibyte restartable to wide character”)
|
converts the next multibyte character in the string pointed to by S
|
into a wide character and stores it in the wide character string
|
pointed to by PWC. The conversion is performed according to the
|
locale currently selected for the ‘LC_CTYPE’ category. If the
|
conversion for the character set used in the locale requires a
|
state, the multibyte string is interpreted in the state represented
|
by the object pointed to by PS. If PS is a null pointer, a static,
|
internal state variable used only by the ‘mbrtowc’ function is
|
used.
|
|
If the next multibyte character corresponds to the NUL wide
|
character, the return value of the function is 0 and the state
|
object is afterwards in the initial state. If the next N or fewer
|
bytes form a correct multibyte character, the return value is the
|
number of bytes starting from S that form the multibyte character.
|
The conversion state is updated according to the bytes consumed in
|
the conversion. In both cases the wide character (either the
|
‘L'\0'’ or the one found in the conversion) is stored in the string
|
pointed to by PWC if PWC is not null.
|
|
If the first N bytes of the multibyte string possibly form a valid
|
multibyte character but there are more than N bytes needed to
|
complete it, the return value of the function is ‘(size_t) -2’ and
|
no value is stored. Please note that this can happen even if N has
|
a value greater than or equal to ‘MB_CUR_MAX’ since the input might
|
contain redundant shift sequences.
|
|
If the first N bytes of the multibyte string cannot possibly form
|
a valid multibyte character, no value is stored, the global
|
variable ‘errno’ is set to the value ‘EILSEQ’, and the function
|
returns ‘(size_t) -1’. The conversion state is afterwards
|
undefined.
|
|
‘mbrtowc’ was introduced in Amendment 1 to ISO C90 and is declared
|
in ‘wchar.h’.
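The four kinds of return value described above can be told apart as in this sketch. The enumeration and the function name are hypothetical; the special values ‘(size_t) -1’ and ‘(size_t) -2’ have to be tested before the result is used as a byte count.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

enum conv_status
{
  CONV_NUL,                     /* the NUL wide character was found */
  CONV_COMPLETE,                /* one character was converted */
  CONV_INCOMPLETE,              /* more than N bytes are needed */
  CONV_INVALID                  /* no valid character starts here */
};

/* Classify the result of converting at most N bytes at S, starting
   in the initial shift state.  On CONV_NUL and CONV_COMPLETE the
   character is stored in *PWC if PWC is not null.  */
enum conv_status
mb_classify (const char *s, size_t n, wchar_t *pwc)
{
  mbstate_t state;
  size_t r;

  assert (s != NULL);
  memset (&state, '\0', sizeof (state));
  r = mbrtowc (pwc, s, n, &state);
  if (r == (size_t) -1)
    return CONV_INVALID;        /* errno is EILSEQ */
  if (r == (size_t) -2)
    return CONV_INCOMPLETE;
  if (r == 0)
    return CONV_NUL;
  return CONV_COMPLETE;         /* R bytes were consumed */
}
```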
|
|
Use of ‘mbrtowc’ is straightforward. A function that copies a
|
multibyte string into a wide character string while at the same time
|
converting all lowercase characters into uppercase could look like this
|
(this is not the final version, just an example; it has no error
|
checking, and sometimes leaks memory):
|
|
wchar_t *
|
mbstouwcs (const char *s)
|
{
|
size_t len = strlen (s);
|
wchar_t *result = malloc ((len + 1) * sizeof (wchar_t));
|
wchar_t *wcp = result;
|
wchar_t tmp[1];
|
mbstate_t state;
|
size_t nbytes;
|
|
memset (&state, '\0', sizeof (state));
|
/* Pass LEN + 1 so that the terminating NUL byte is converted too. */
|
while ((nbytes = mbrtowc (tmp, s, len + 1, &state)) > 0)
|
{
|
if (nbytes >= (size_t) -2)
|
/* Invalid input string. */
|
return NULL;
|
*wcp++ = towupper (tmp[0]);
|
len -= nbytes;
|
s += nbytes;
|
}
|
*wcp = L'\0';
|
return result;
|
}
|
|
The use of ‘mbrtowc’ should be clear. A single wide character is
|
stored in ‘TMP[0]’, and the number of consumed bytes is stored in the
|
variable NBYTES. If the conversion is successful, the uppercase variant
|
of the wide character is stored in the RESULT array and the pointer to
|
the input string and the number of available bytes are adjusted.
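For comparison, here is a sketch of how a variant with error checking might look. It is an illustration, not the library's code: the name ‘mbstouwcs_checked’ is hypothetical. It checks the result of ‘malloc’, frees the buffer on invalid input, and stores the terminating null wide character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Convert the multibyte string S to a newly allocated, uppercased
   wide character string.  Returns NULL if memory is exhausted or S
   is not valid in the current locale.  The caller frees the result.  */
wchar_t *
mbstouwcs_checked (const char *s)
{
  size_t len;
  size_t nbytes;
  wchar_t *result;
  wchar_t *wcp;
  mbstate_t state;

  assert (s != NULL);
  len = strlen (s) + 1;         /* include the terminating NUL byte */
  result = malloc (len * sizeof (wchar_t));
  if (result == NULL)
    return NULL;
  wcp = result;
  memset (&state, '\0', sizeof (state));
  for (;;)
    {
      wchar_t wc;

      nbytes = mbrtowc (&wc, s, len, &state);
      if (nbytes == (size_t) -1 || nbytes == (size_t) -2)
        {
          /* Invalid or truncated input: do not leak the buffer.  */
          free (result);
          return NULL;
        }
      if (nbytes == 0)
        {
          /* The NUL wide character ends the string.  */
          *wcp = L'\0';
          return result;
        }
      *wcp++ = towupper (wc);
      s += nbytes;
      len -= nbytes;
    }
}
```

The buffer is still sized pessimistically, with one wide character per input byte.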
|
|
The only non-obvious thing about ‘mbstouwcs’ might be the way memory is
|
allocated for the result. The above code uses the fact that there can
|
never be more wide characters in the converted result than there are
|
bytes in the multibyte input string. This method yields a pessimistic
|
guess about the size of the result, and if many wide character strings
|
have to be constructed this way or if the strings are long, the extra
|
memory required to be allocated because the input string contains
|
multibyte characters might be significant. The allocated memory block
|
can be resized to the correct size before returning it, but a better
|
solution might be to allocate just the right amount of space for the
|
result right away. Unfortunately there is no function to compute the
|
length of the wide character string directly from the multibyte string.
|
There is, however, a function that does part of the work.
|
|
-- Function: size_t mbrlen (const char *restrict S, size_t N, mbstate_t
|
*PS)
|
Preliminary: | MT-Unsafe race:mbrlen/!ps | AS-Unsafe corrupt heap
|
lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety
|
Concepts::.
|
|
The ‘mbrlen’ function (“multibyte restartable length”) computes the
|
number of at most N bytes starting at S, which form the next valid
|
and complete multibyte character.
|
|
If the next multibyte character corresponds to the NUL wide
|
character, the return value is 0. If the next N bytes form a valid
|
multibyte character, the number of bytes belonging to this
|
multibyte character byte sequence is returned.
|
|
If the first N bytes possibly form a valid multibyte character but
|
the character is incomplete, the return value is ‘(size_t) -2’.
|
Otherwise the multibyte character sequence is invalid and the
|
return value is ‘(size_t) -1’.
|
|
The multibyte sequence is interpreted in the state represented by
|
the object pointed to by PS. If PS is a null pointer, a state
|
object local to ‘mbrlen’ is used.
|
|
‘mbrlen’ was introduced in Amendment 1 to ISO C90 and is declared
|
in ‘wchar.h’.
|
|
The attentive reader now will note that ‘mbrlen’ can be implemented
|
as
|
|
mbrtowc (NULL, s, n, ps != NULL ? ps : &internal)
|
|
This is true and in fact is mentioned in the official specification.
|
How can this function be used to determine the length of the wide
|
character string created from a multibyte character string? It is not
|
directly usable, but we can define a function ‘mbslen’ using it:
|
|
size_t
|
mbslen (const char *s)
|
{
|
mbstate_t state;
|
size_t result = 0;
|
size_t nbytes;
|
memset (&state, '\0', sizeof (state));
|
while ((nbytes = mbrlen (s, MB_LEN_MAX, &state)) > 0)
|
{
|
if (nbytes >= (size_t) -2)
|
/* Something is wrong. */
|
return (size_t) -1;
|
s += nbytes;
|
++result;
|
}
|
return result;
|
}
|
|
This function simply calls ‘mbrlen’ for each multibyte character in
|
the string and counts the number of function calls. Note that we use
|
‘MB_LEN_MAX’ as the size argument in the ‘mbrlen’ call. This is
|
acceptable since a) this value is at least as large as the length of the
|
longest multibyte character sequence and b) we know that the string S
|
ends with a NUL byte, which cannot be part of any other multibyte
|
character sequence but the one representing the NUL wide character.
|
Therefore, the ‘mbrlen’ function will never read invalid memory.
|
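The argument above depends on the string being NUL-terminated. For a buffer that might not be, a bounded variant can pass the number of remaining bytes instead of ‘MB_LEN_MAX’. The following is only a sketch; the name ‘mbsnlen_sketch’ and its error convention are assumptions made for illustration, and the function is not part of the GNU C Library.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical bounded variant of mbslen: count the multibyte
   characters in at most N bytes starting at S, stopping early at a
   NUL byte.  Returns (size_t) -1 for an invalid sequence or for a
   character that is cut off at the end of the N bytes.  */
size_t
mbsnlen_sketch (const char *s, size_t n)
{
  mbstate_t state;
  size_t result = 0;

  memset (&state, '\0', sizeof (state));
  while (n > 0)
    {
      /* Never let mbrlen look past the bytes we were given.  */
      size_t nbytes = mbrlen (s, n, &state);

      if (nbytes == 0)
        /* Reached the NUL character.  */
        break;
      if (nbytes >= (size_t) -2)
        /* Invalid or incomplete sequence.  */
        return (size_t) -1;
      s += nbytes;
      n -= nbytes;
      ++result;
    }
  return result;
}
```

Here a return value of ‘(size_t) -2’ from ‘mbrlen’ is treated as an error because no further input can follow the N bytes.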
|
Now that this function is available (just to make this clear, this
|
function is _not_ part of the GNU C Library) we can compute the number
|
of bytes required to store the wide character string converted from the
|
multibyte character string S using
|
|
wcs_bytes = (mbslen (s) + 1) * sizeof (wchar_t);
|
|
Please note that the ‘mbslen’ function is quite inefficient. The
|
implementation of ‘mbstouwcs’ with ‘mbslen’ would have to perform the
|
conversion of the multibyte character input string twice, and this
|
conversion might be quite expensive. So it is necessary to think about
|
the consequences of using the easier but imprecise method before doing
|
the work twice.
|
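The cost of the two passes can be seen in the following sketch, which counts first and converts afterwards. It uses ‘mbsrtowcs’ with a null destination for the counting pass instead of ‘mbslen’; the helper name ‘mbs_to_wcs_alloc’ is hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert the multibyte string S into a newly
   allocated wide character string.  Returns NULL on an invalid
   sequence or if memory is exhausted; the caller frees the result.  */
wchar_t *
mbs_to_wcs_alloc (const char *s)
{
  mbstate_t state;
  const char *src = s;
  size_t nwc;
  wchar_t *result;

  /* First pass: with a null destination, mbsrtowcs only counts the
     wide characters (excluding the terminator) and leaves SRC alone.  */
  memset (&state, '\0', sizeof (state));
  nwc = mbsrtowcs (NULL, &src, 0, &state);
  if (nwc == (size_t) -1)
    return NULL;

  result = malloc ((nwc + 1) * sizeof (wchar_t));
  if (result == NULL)
    return NULL;

  /* Second pass: the real conversion, starting again from the
     initial state.  */
  src = s;
  memset (&state, '\0', sizeof (state));
  if (mbsrtowcs (result, &src, nwc + 1, &state) == (size_t) -1)
    {
      free (result);
      return NULL;
    }
  return result;
}
```

The input is decoded twice here, which is exactly the overhead discussed above; whether that matters depends on the length of the strings and on the character set in use.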
|
-- Function: size_t wcrtomb (char *restrict S, wchar_t WC, mbstate_t
|
*restrict PS)
|
Preliminary: | MT-Unsafe race:wcrtomb/!ps | AS-Unsafe corrupt heap
|
lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety
|
Concepts::.
|
|
The ‘wcrtomb’ function (“wide character restartable to multibyte”)
|
converts a single wide character into a multibyte string
|
corresponding to that wide character.
|
|
If S is a null pointer, the function resets the state stored in the
|
object pointed to by PS (or the internal ‘mbstate_t’ object) to the
|
initial state. This can also be achieved by a call like this:
|
|
wcrtomb (temp_buf, L'\0', ps)
|
|
since, if S is a null pointer, ‘wcrtomb’ performs as if it writes
|
into an internal buffer, which is guaranteed to be large enough.
|
|
If WC is the NUL wide character, ‘wcrtomb’ emits, if necessary, a
|
shift sequence to get the state PS into the initial state followed
|
by a single NUL byte, which is stored in the string S.
|
|
Otherwise a byte sequence (possibly including shift sequences) is
|
written into the string S. This only happens if WC is a valid wide
|
character (i.e., it has a multibyte representation in the character
|
set selected by the ‘LC_CTYPE’ category of the current locale). If WC is
|
not a valid wide character, nothing is stored in the string S, ‘errno’
|
is set to ‘EILSEQ’, the conversion state in PS is undefined and the
|
return value is ‘(size_t) -1’.
|
|
If no error occurred the function returns the number of bytes
|
stored in the string S. This includes all bytes representing shift
|
sequences.
|
|
One word about the interface of the function: there is no parameter
|
specifying the length of the array S. Instead the function assumes
|
that there are at least ‘MB_CUR_MAX’ bytes available since this is
|
the maximum length of any byte sequence representing a single
|
character. So the caller has to make sure that there is enough
|
space available, otherwise buffer overruns can occur.
|
|
‘wcrtomb’ was introduced in Amendment 1 to ISO C90 and is declared
|
in ‘wchar.h’.
|
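As a minimal sketch of the buffer requirement, the following hypothetical helper converts one wide character starting from the initial state. It assumes the caller supplies an array of at least ‘MB_LEN_MAX + 1’ bytes; ‘MB_LEN_MAX’ is a compile-time constant that is never smaller than ‘MB_CUR_MAX’.

```c
#include <limits.h>   /* MB_LEN_MAX, for callers sizing BUF.  */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: store the multibyte form of WC in BUF, which
   must have room for at least MB_LEN_MAX + 1 bytes, and terminate it
   with a NUL byte.  Returns the number of bytes stored before the
   terminator, or (size_t) -1 if WC cannot be converted.  */
size_t
wc_to_mbs (char *buf, wchar_t wc)
{
  mbstate_t state;
  size_t nbytes;

  memset (&state, '\0', sizeof (state));
  nbytes = wcrtomb (buf, wc, &state);
  if (nbytes == (size_t) -1)
    /* WC is not representable; errno is EILSEQ.  */
    return (size_t) -1;
  buf[nbytes] = '\0';
  return nbytes;
}
```

A caller would declare the array as ‘char buf[MB_LEN_MAX + 1]’. Note that converting the NUL wide character also yields a count of 1, since the NUL byte itself is stored.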
|
Using ‘wcrtomb’ is as easy as using ‘mbrtowc’. The following example
|
appends a wide character string to a multibyte character string. Again,
|
the code is not really useful (or correct); it is simply here to
|
demonstrate the use and some problems.
|
|
#include <errno.h>
|
#include <stdlib.h>
|
#include <string.h>
|
#include <wchar.h>
|
|
char *
|
mbscatwcs (char *s, size_t len, const wchar_t *ws)
|
{
|
mbstate_t state;
|
/* Find the end of the existing string. */
|
char *wp = strchr (s, '\0');
|
len -= wp - s;
|
memset (&state, '\0', sizeof (state));
|
do
|
{
|
size_t nbytes;
|
if (len < MB_CUR_MAX)
|
{
|
/* We cannot guarantee that the next
|
character fits into the buffer, so
|
return an error. */
|
errno = E2BIG;
|
return NULL;
|
}
|
nbytes = wcrtomb (wp, *ws, &state);
|
if (nbytes == (size_t) -1)
|
/* Error in the conversion. */
|
return NULL;
|
len -= nbytes;
|
wp += nbytes;
|
}
|
while (*ws++ != L'\0');
|
return s;
|
}
|
|
First the function has to find the end of the string currently in the
|
array S. The ‘strchr’ call does this very efficiently since a
|
requirement for multibyte character representations is that the NUL byte
|
is never used except to represent itself (and in this context, the end
|
of the string).
|
|
After initializing the state object the loop is entered where the
|
first task is to make sure there is enough room in the array S. We
|
abort if there are not at least ‘MB_CUR_MAX’ bytes available. This is
|
not always optimal but we have no other choice. We might have fewer than
|
‘MB_CUR_MAX’ bytes available but the next multibyte character might also
|
be only one byte long. At the time the ‘wcrtomb’ call returns it is too
|
late to decide whether the buffer was large enough. If this solution is
|
unsuitable, there is a very slow but more accurate solution.
|
|
…
|
if (len < MB_CUR_MAX)
|
{
|
/* Convert into a scratch buffer first.  MB_LEN_MAX
|
comes from limits.h.  */
|
char temp_buf[MB_LEN_MAX];
|
mbstate_t temp_state;
|
size_t needed;
|
memcpy (&temp_state, &state, sizeof (state));
|
needed = wcrtomb (temp_buf, *ws, &temp_state);
|
if (needed == (size_t) -1)
|
/* Error in the conversion. */
|
return NULL;
|
if (needed > len)
|
{
|
/* The next character does not fit into
|
the buffer, so return an error. */
|
errno = E2BIG;
|
return NULL;
|
}
|
}
|
…
|
|
Here we first perform the conversion that might overflow the buffer
|
into a scratch buffer, so that we are afterwards in the position to make
|
an exact decision about the buffer size. Please note that the destination
|
of the new ‘wcrtomb’ call must be a real buffer: with a null pointer as
|
the destination, ‘wcrtomb’ ignores WC and behaves as if it converts the
|
NUL wide character, so it cannot be used to measure the length. The most
|
unusual thing about this piece of code certainly is the duplication of
|
the conversion state object, but if a change of the state is necessary
|
to emit the next multibyte character, we want to have the same shift
|
state change performed in the real conversion. Therefore, we have to
|
preserve the initial shift state information.
|
|
There are certainly many more and even better solutions to this
|
problem. This example is only provided for educational purposes.
|
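One such alternative, given here only as a sketch, converts every character once into a scratch buffer and copies it into place if it fits, which makes the copy of the conversion state unnecessary. The name ‘mbscatwcs_exact’ is hypothetical. A real destination buffer is used because ‘wcrtomb’ ignores WC when S is a null pointer.

```c
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical variant of mbscatwcs: append WS to the multibyte
   string in S, whose array holds LEN bytes.  Each character is
   first converted into a scratch buffer, so a character is rejected
   only if it really does not fit.  */
char *
mbscatwcs_exact (char *s, size_t len, const wchar_t *ws)
{
  mbstate_t state;
  char scratch[MB_LEN_MAX];
  /* Find the end of the existing string.  */
  char *wp = strchr (s, '\0');

  len -= wp - s;
  memset (&state, '\0', sizeof (state));
  do
    {
      size_t nbytes = wcrtomb (scratch, *ws, &state);

      if (nbytes == (size_t) -1)
        /* Error in the conversion.  */
        return NULL;
      if (nbytes > len)
        {
          /* The converted character does not fit.  */
          errno = E2BIG;
          return NULL;
        }
      memcpy (wp, scratch, nbytes);
      len -= nbytes;
      wp += nbytes;
    }
  while (*ws++ != L'\0');
  return s;
}
```

As in the original example, the array S may be left without a terminating NUL byte when the function fails part of the way through, so a caller that wants to keep using S has to handle that case itself.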