
Recreate printf() in C

Paul Stuart

ServiceNow | CIS-CSA | CIS-ITSM | CIS-HRSD | ITIL-4 | Integrations | JavaScript |

Published: March 19, 2024

This article is a full breakdown of a project I undertook at the 42 Adelaide coding school. It is a rebuild of the printf() function, which is part of the standard input/output library (stdio.h) in the C programming language. The purpose of the task is to learn two key things: firstly, how memory is handled at a low level in programming, and secondly, how to handle and manipulate data types, which ones are useful in specific contexts, and why. I am also a firm believer that if you can teach it, then you know it. It's the best way to learn in my opinion, which is why I like to share.

To view the full source code, have a look at my 42-Adelaide repository on GitHub, where printf() and other projects are added regularly: https://github.com/codingPaulStuart/42-Adelaide-Core

Below is the standard printf() function, which comes from the C library. A recreation has to follow the same standards and rules to produce the same result.

#include <stdio.h>

int main(void)
{
    /* Standard library printf(): the behaviour the rebuild has to match. */
    printf("Hello Print F, let's make this from scratch!");
    return 0;
}

Must have the following capabilities:

It must not implement the buffer management of the original printf().
The function has to handle the following conversions: cspdiuxX%
%c Prints a single character.
%s Prints a string (as defined by the common C convention).
%p The void * pointer argument has to be printed in hexadecimal format.
%d Prints a decimal (base 10) number.
%i Prints an integer in base 10.
%u Prints an unsigned decimal (base 10) number.
%x Prints a number in hexadecimal (base 16) lowercase format.
%X Prints a number in hexadecimal (base 16) uppercase format.
%% Prints a percent sign.

High Level View

The image above shows the entry point to the program and how the components are modularized. Essentially, if no format specifier is passed into the printf() function, it simply outputs the string. The variations occur when a format specifier is given (u, d, p, etc.): the ft_type function acts as a switchboard that routes the argument to the correct function based on the specifier. See the table below for reference on the types of formatting that can be used in the printf() function.

To further illustrate how the functions and files interact, the image below is a sequence diagram for a simple use case: printing '12' to the standard output. Starting from the first actor on the left, the argument is passed through a number of functions. Notice there is a lot of recursive function calling. This is because the number often has to be reduced to a single digit so that the digits can be printed one by one. Each printed character is also tracked by a counter, which is passed through all the functions as a pointer.

Sequence Diagram showing use case for printing '12'

To look at the printf function in a simple way, if there is no format specifier, then simply output the characters. Otherwise, based on the specifier, call the relevant function to handle the different use cases. See table below for function breakdown:

Below is the entry point for printf and you can see the conditional checks and calling to other important utility functions in the program.

The ft_type function is the main switchboard: it checks the format specifier and calls the relevant function to handle that use case. Any printf usage with %u, %p, %i, etc. will go through this switchboard.
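As a rough illustration of the entry point and switchboard described above, here is a minimal sketch. It is not necessarily the code in the repository: the signatures, the helper names put_char and put_unsigned, and the choice to pass the va_list by pointer are all assumptions made for this example, and only three conversions are shown.

```c
#include <stdarg.h>
#include <unistd.h>

/* Hypothetical helper: write one character to stdout and bump the counter. */
void put_char(char c, int *count)
{
    if (write(1, &c, 1) == 1)
        (*count)++;
}

/* Hypothetical helper: print a non-negative number in base 10, digit by digit. */
void put_unsigned(unsigned long n, int *count)
{
    if (n >= 10)
        put_unsigned(n / 10, count);
    put_char((char)('0' + n % 10), count);
}

/* Switchboard: route one specifier to its handler.
 * Only c, u and % are shown; unknown specifiers are silently skipped here. */
void ft_type(char spec, va_list *args, int *count)
{
    if (spec == 'c')
        put_char((char)va_arg(*args, int), count);
    else if (spec == 'u')
        put_unsigned(va_arg(*args, unsigned int), count);
    else if (spec == '%')
        put_char('%', count);
}

/* Entry point: plain characters are written as-is,
 * '%' hands the next character to the switchboard. */
int ft_printf(const char *format, ...)
{
    va_list args;
    int     count = 0;

    va_start(args, format);
    while (*format)
    {
        if (*format == '%' && format[1] != '\0')
            ft_type(*++format, &args, &count);
        else
            put_char(*format, &count);
        format++;
    }
    va_end(args);
    return count;
}
```

With this sketch, ft_printf("%c%u%%", 'x', 120u) writes x120% and returns 5, the number of characters written.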

Strings

This is one of the simpler cases, as the characters need no conversion before they are output. It relies on a utility function, ft_putchar, which features extensively throughout the project. ft_putchar calls the write function (declared in unistd.h) to print one character to the standard output (file descriptor 1), writing exactly 1 byte.
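The string case can be sketched as follows. The name ft_printstr, the counter parameter on ft_putchar, and the "(null)" fallback are assumptions for illustration; the fallback mirrors what common C libraries do, while the C standard leaves a NULL string undefined.

```c
#include <stddef.h>
#include <unistd.h>

/* Write a single character to standard output (fd 1), exactly 1 byte,
 * and count it only if the write succeeded. */
void ft_putchar(char c, int *count)
{
    if (write(1, &c, 1) == 1)
        (*count)++;
}

/* Hypothetical string printer: walk the string and output it one
 * character at a time until the terminating '\0'. */
void ft_printstr(const char *s, int *count)
{
    if (s == NULL)
        s = "(null)";   /* assumed fallback, not required by the standard */
    while (*s)
        ft_putchar(*s++, count);
}
```

Calling ft_printstr("Hello", &count) with count starting at 0 leaves count at 5.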

Integers

The first thing to explain about the number length function is what the size_t data type is. The size_t type is an unsigned integer type that is used to represent sizes, and it is guaranteed to be big enough to contain the size of the largest possible object on the system. It is used here because the length of a number cannot be negative, and it is the standard type for sizes and counts in C.

The number length function calculates the number of characters needed to represent the integer as a string. Returning a size_t means the count can never be negative and is always wide enough for any length. Notice that the division by 10 is what determines how many digits the number has: each iteration of the loop removes the last decimal digit, and the loop stops when nothing is left.

calculate the number of characters for output
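A minimal sketch of such a length function is shown below. The name ft_numlen and the decision to count the minus sign as one character are assumptions for this example.

```c
#include <stddef.h>

/* Count the characters needed to print n in base 10,
 * including a leading '-' for negative values. */
size_t ft_numlen(int n)
{
    size_t len = 0;

    if (n <= 0)
        len = 1;        /* room for the single digit '0' or for the '-' sign */
    while (n != 0)
    {
        n /= 10;        /* drop the last decimal digit */
        len++;
    }
    return len;
}
```

For example, ft_numlen(12) is 2 and ft_numlen(-12) is 3, because the sign takes one character.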

The ft_putnbr_fd takes the number being printed, and the file descriptor of 1, which is the standard output. This function is part of another custom library built in the 42 school as a project. You can view the full libft library on my GitHub repository as well.

The ft_putnbr_fd function from the libft library also handles edge cases. The most important one is the check for the value -2147483648, which is the minimum value for a 32-bit signed int. This check comes first because the absolute value of -2147483648 cannot be represented in a 32-bit signed integer, so negating it would overflow. It can only be printed from its string representation.

A continual theme that runs through all the printf functions I have remade is the use of recursion, meaning the function calls itself. This is needed because the characters have to be output one by one, which means breaking the number down repeatedly until only a single digit is left.

You can see this on two occasions in the function. In the second recursive call towards the bottom, if the number is not negative and is 10 or greater, the function first calls itself with the number divided by 10, effectively printing all but the last digit. It then calculates the last digit by taking the remainder of the number divided by 10 and converting it to an ASCII character.

Remember that adding '0' to any single digit is really adding 48, which gives the ASCII value of that digit.
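The recursion and the minimum-value check can be sketched like this. It is a simplified version written for this explanation, not necessarily the libft code, and it assumes a 32-bit int; the helper put_char_fd is hypothetical.

```c
#include <unistd.h>

/* Hypothetical helper: write one character to the given file descriptor. */
void put_char_fd(char c, int fd)
{
    write(fd, &c, 1);
}

/* Print n in base 10 to fd, one character at a time. */
void ft_putnbr_fd(int n, int fd)
{
    if (n == -2147483647 - 1)
    {
        /* -n would overflow a 32-bit int, so print the text directly. */
        write(fd, "-2147483648", 11);
        return;
    }
    if (n < 0)
    {
        put_char_fd('-', fd);
        n = -n;
    }
    if (n >= 10)
        ft_putnbr_fd(n / 10, fd);           /* print all but the last digit */
    put_char_fd((char)('0' + n % 10), fd);  /* last digit: remainder + 48 */
}
```

Calling ft_putnbr_fd(12, 1) first recurses with 1, prints '1', then returns and prints '2'.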

Pointers

Three functions are used for outputting pointers. We need to calculate the length of the string, set the prefix of '0x' before the hex representation, then finally output the hex representation of the pointer using recursion again, and also finally using the standard output with the write function.

Notice that this time the recursion divides the input by 16 and takes the remainder, because hexadecimal is base 16 (digits 0-9 and A-F).

Before exploring the functions, it's important to understand why particular data types are used here. Printing a pointer address requires an unsigned integer type wide enough to hold any address on the platform. This prevents overflow, where the type cannot represent the value and the result is truncated or otherwise wrong. Suitable types include uintptr_t and unsigned long long; void * is the pointer type itself, and it is converted to one of these integer types before printing. Addresses are treated as non-negative values, so the integer type is always unsigned.

Printp prints the '0x' prefix followed by the hexadecimal representation of a pointer. If the value is not 0, putptr is used to output the hexadecimal digits of the pointer, and the counter is incremented by the length returned from the ptrlen function. Notice that when ft_putptr is invoked there is no need to type cast. This is because converting from a narrower integer type to a wider one happens implicitly. The compiler applies the conversion automatically because the value is being promoted up the type hierarchy, not down.

Putptr prints the hexadecimal representation of a pointer by recursively calling itself with the value divided by 16 (base-16 hex). The recursion continues until the value is under 16, at which point it is a single hex digit that can be output. The uintptr_t data type is designed specifically for holding a pointer value as an integer, which makes it suitable for storing and doing arithmetic on memory addresses. Where it is defined, a uintptr_t is guaranteed to be large enough to hold a pointer.

The pointer length function calculates the length of the string that would represent the pointer's hexadecimal value. It returns a size_t, the standard type for the size of an object, which keeps the code portable across platforms. Each hexadecimal digit represents four binary digits (0s and 1s), and the value is divided by 16 on each iteration because hexadecimal is base 16.
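Put together, the three pointer functions might look like the sketch below. The exact names and signatures (ft_ptrlen, ft_putptr, ft_printp, a size_t counter) are assumptions, as is printing 0x0 for a null pointer; the standard printf() leaves the %p format implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Number of hex digits in value; 0 yields 0 because the caller
 * handles the null pointer separately. */
size_t ft_ptrlen(uintptr_t value)
{
    size_t len = 0;

    while (value != 0)
    {
        value /= 16;    /* drop the last hex digit */
        len++;
    }
    return len;
}

/* Recursively print value in lowercase hexadecimal. */
void ft_putptr(uintptr_t value)
{
    if (value >= 16)
        ft_putptr(value / 16);
    write(1, &"0123456789abcdef"[value % 16], 1);
}

/* Print "0x" followed by the address, adding the characters to *count. */
void ft_printp(const void *ptr, size_t *count)
{
    uintptr_t value = (uintptr_t)ptr;   /* pointer to integer needs a cast */

    write(1, "0x", 2);
    *count += 2;
    if (value == 0)
    {
        write(1, "0", 1);               /* assumed output for NULL: 0x0 */
        *count += 1;
    }
    else
    {
        ft_putptr(value);
        *count += ft_ptrlen(value);
    }
}
```

In this sketch the only explicit cast is from void * to uintptr_t; passing the resulting uintptr_t on to ft_putptr and ft_ptrlen needs none.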

Unsigned Integers

This function is much more straightforward and does less work, as it expects a non-negative integer. The main difference between ft_printu and ft_putnbr_fd is that ft_printu does not have to check for negative values or handle the minimum-value edge case. This shows the simplicity of ft_printu in contrast to ft_putnbr_fd, which must handle negative values.

Again, notice the recursion used to get the ASCII value. Each call must end up with a single digit, so any value greater than 9 means recursively calling printu with the value divided by 10. The remainder after dividing by 10 is then passed to ft_putchar, and adding '0' (48) gives the ASCII value.
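A minimal sketch of the unsigned case follows. The signature with an int counter is an assumption, and write is called directly here instead of going through ft_putchar so that the block stands alone.

```c
#include <unistd.h>

/* Print an unsigned number in base 10, counting each character written. */
void ft_printu(unsigned int n, int *count)
{
    char digit;

    if (n > 9)
        ft_printu(n / 10, count);   /* print the leading digits first */
    digit = (char)('0' + n % 10);   /* remainder + 48 = ASCII digit */
    if (write(1, &digit, 1) == 1)
        (*count)++;
}
```

There is no sign handling at all, which is why it is so much shorter than the signed version.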

Hexadecimals

Firstly, what are hex values and why are they used? Hexadecimal numbers are used whenever the binary representation is important to the context. Two hex digits represent exactly one byte, so it is easy to see where one byte ends and the next begins. Hexadecimal is a more convenient and readable format for binary data than a long list of 1s and 0s. A great example of this is when working with memory addresses and network addresses, where hexadecimal allows addresses to be separated into their components.

See the example below, which uses hex values to represent the binary numbers of an IP address, with the base-16 table as a reference. The hex value for the number 192, which is C0, can be worked out by first converting to binary through repeated division by 2.

Binary numbers are calculated by dividing the number by 2 until you get 0. Reading the remainders in reverse gives the binary number, so 192 is calculated by: 192 ÷ 2 = 96 (remainder 0), 96 ÷ 2 = 48 (remainder 0), 48 ÷ 2 = 24 (remainder 0), 24 ÷ 2 = 12 (remainder 0), 12 ÷ 2 = 6 (remainder 0), 6 ÷ 2 = 3 (remainder 0), 3 ÷ 2 = 1 (remainder 1), 1 ÷ 2 = 0 (remainder 1). Combining the remainders in the order they were produced gives 00000011, and reversing them gives 11000000. Splitting that into groups of four bits gives 1100 (12, which is C in hex) and 0000 (0), so 192 is C0.
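The same repeated-division method works for any base, which can be checked with a small sketch. The function to_base is hypothetical and written only for this illustration; it assumes an unsigned int of at most 32 bits and an output buffer of at least 33 bytes.

```c
#include <stddef.h>

/* Convert value to text in the given base (2 to 16) by repeated division.
 * Remainders come out last digit first, so they are reversed into out.
 * Returns the number of digits written (excluding the terminating '\0'). */
size_t to_base(unsigned int value, unsigned int base, char *out)
{
    const char *digits = "0123456789ABCDEF";
    char        tmp[32];
    size_t      len = 0;
    size_t      i;

    do
    {
        tmp[len++] = digits[value % base];  /* remainder is the next digit */
        value /= base;
    } while (value != 0);
    for (i = 0; i < len; i++)
        out[i] = tmp[len - 1 - i];          /* reverse the remainders */
    out[len] = '\0';
    return len;
}
```

With a buffer char buf[33], to_base(192, 2, buf) fills buf with 11000000 and to_base(192, 16, buf) fills it with C0.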

The print hex function prints an unsigned int in hexadecimal format and tracks the number of characters printed. If the value is not 0, it calls the puthex function to print the hexadecimal digits of the number, then increments the counter by the length of the number's hexadecimal representation (from the hexlen function).

The hex length function calculates the length of the hexadecimal representation of an unsigned integer. It initializes a size_t total to count the hexadecimal digits. The while loop divides the value by 16 (base-16 hex) and increments total until the value is 0. It then returns the total count of hexadecimal digits.
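The hex functions described above can be sketched as follows. The names, the upper flag used to cover both %x and %X, and the size_t counter are assumptions for this example rather than the repository's exact code.

```c
#include <stddef.h>
#include <unistd.h>

/* Count the hex digits in value; 0 yields 0, so the caller handles zero. */
size_t ft_hexlen(unsigned int value)
{
    size_t total = 0;

    while (value != 0)
    {
        value /= 16;
        total++;
    }
    return total;
}

/* Recursively print value in hex; upper selects A-F instead of a-f. */
void ft_puthex(unsigned int value, int upper)
{
    const char *digits = upper ? "0123456789ABCDEF" : "0123456789abcdef";

    if (value >= 16)
        ft_puthex(value / 16, upper);
    write(1, &digits[value % 16], 1);
}

/* Print value for %x (upper == 0) or %X (upper != 0) and update *count. */
void ft_printhex(unsigned int value, int upper, size_t *count)
{
    if (value == 0)
    {
        write(1, "0", 1);
        *count += 1;
        return;
    }
    ft_puthex(value, upper);
    *count += ft_hexlen(value);
}
```

For instance, ft_printhex(255, 0, &count) writes ff and adds 2 to count, while ft_printhex(255, 1, &count) writes FF.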

These are the key functions used in my recreation of printf in the C programming language. As you can see, writing these sorts of functions depends on understanding the nature and context of the different data types and on handling any edge cases.

Thanks for taking the time to read, please share/comment if you found it useful!
