How to avoid bugs using modern C++ (Part 1)
In C++11, the keywords auto and decltype were added. Of course, you already know how they work.
auto it = m.find(42);
//C++98: std::map<int, int>::iterator it = m.find(42);
It's very convenient to shorten long types without losing the readability of the code. However, these keywords become truly powerful in combination with templates: with auto and decltype there is no need to spell out the type of the returned value.
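As an illustration of the point about templates, here is a minimal sketch of return type deduction. The function names add11 and add14 are made up for this example and do not come from the article.

```cpp
#include <type_traits>

// C++11: the return type is computed from the operands with decltype.
template <class T, class U>
auto add11(T a, U b) -> decltype(a + b) {
    return a + b;
}

// C++14: the return type is deduced from the return statement.
template <class T, class U>
auto add14(T a, U b) {
    return a + b;
}

// int + double yields double, and nobody had to write that type down.
static_assert(std::is_same<decltype(add11(1, 2.5)), double>::value,
              "return type follows the usual arithmetic conversions");
```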
But let's go back to our topic. Here is an example of a 64-bit error:
|
unsigned if
|
In a 64-bit application, the value of string::npos is greater than UINT_MAX, the maximum value that a variable of unsigned type can represent. It might seem that this is a case where auto can save us from this kind of problem: the type of the variable n isn't important to us; what matters is that it can hold every possible result of string::find. And indeed, if we rewrite this example with auto, the error is gone:
|
auto if
|
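The two variants can be sketched as follows. This is an illustrative reconstruction, not the article's original listing; the function names are hypothetical, and the behavior described in the comments assumes a platform where size_t is 64 bits wide and unsigned is 32 bits wide.

```cpp
#include <string>

// Narrowing version: the size_t result of find() is squeezed into unsigned.
// The cast only makes the narrowing explicit; an implicit conversion behaves the same.
inline bool found_via_unsigned(const std::string& s, const std::string& what) {
    unsigned n = static_cast<unsigned>(s.find(what));
    // n can never exceed UINT_MAX, so with a 64-bit npos this is always true.
    return n != std::string::npos;
}

// auto version: n has exactly the type that find() returns.
inline bool found_via_auto(const std::string& s, const std::string& what) {
    auto n = s.find(what);
    return n != std::string::npos;
}
```

With these definitions, found_via_unsigned("abc", "z") reports a match on such a platform even though the substring is absent, while found_via_auto gives the expected answer.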
But it's not that simple. Using auto is not a panacea, and there are many pitfalls related to its use. For example, you can write code like this:
|
auto
char |
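A minimal sketch of why auto does not help with this kind of overflow, assuming the buffer size is written as a product of int literals. The name kBufferSize is chosen for this example only.

```cpp
#include <type_traits>

// Problematic form, left as a comment because signed overflow is undefined behavior:
//   auto n = 1024 * 1024 * 1024 * 5;   // every operand is int, so n is int
static_assert(std::is_same<decltype(1024 * 1024), int>::value,
              "int * int stays int, whatever the result is assigned to");

// One possible fix: make the first operand wide enough before multiplying.
constexpr auto kBufferSize = 1024ull * 1024 * 1024 * 5;

static_assert(std::is_same<decltype(kBufferSize), const unsigned long long>::value,
              "auto now deduces a 64-bit unsigned type");
static_assert(kBufferSize == 5368709120ull, "5 GiB");
```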
auto won't save us from the integer overflow, and less than 5 GiB of memory will be allocated for the buffer.
auto also isn't of much help when it comes to a very common error: an incorrectly written loop. Let's look at an example:
|
for
|
For large arrays, this loop becomes an infinite loop. It's no surprise that such errors exist in code: they show up only in very rare cases, for which no tests were written.
Can we rewrite this fragment with auto?
|
for
|
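The reason can be sketched like this: in `auto i = 0` the type comes from the literal, not from the container, so the counter becomes a signed int. The function below is a hypothetical example of spelling the index type explicitly instead.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// `auto i = 0` deduces int from the literal 0, regardless of what i is compared with.
static_assert(std::is_same<decltype(0), int>::value, "the literal 0 is an int");

// Explicit index type that matches what size() returns.
inline std::size_t count_elements(const std::vector<int>& v) {
    std::size_t visited = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        ++visited;
    }
    return visited;
}
```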
No. Not only is the error still here, it has become even worse.
With simple types auto behaves very badly. Yes, in the simplest cases (auto x = y) it works, but as soon as additional constructs appear, the behavior becomes less predictable. What's worse, the error becomes harder to notice, because the types of the variables aren't obvious at first glance. Fortunately, this is not a problem for static analyzers: they don't get tired and don't lose attention. But for us mere mortals it's better to specify the types explicitly. We can also get rid of narrowing conversions by other means, but we'll talk about that later.
Dangerous countof
One of the "dangerous" types in C++ is an array. Often, when passing it to a function, programmers forget that it is passed as a pointer, and try to calculate the number of elements with sizeof.
|
#define RTL_NUMBER_OF_V1(A) (sizeof(A)/sizeof((A)[0]))
#define _ARRAYSIZE(A) RTL_NUMBER_OF_V1(A)
int
|
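A small sketch of what happens to an array parameter. The function name and the size 512 are made up for this illustration; it uses decltype instead of sizeof so that the check itself does not repeat the mistake.

```cpp
#include <type_traits>

// The 512 in the parameter list is ignored by the compiler:
// `int values[512]` as a parameter is adjusted to `int* values`.
inline bool parameter_is_pointer(int values[512]) {
    return std::is_same<decltype(values), int*>::value;
}

// For a real array object the sizeof idiom does work.
static_assert(sizeof(int[512]) / sizeof(int) == 512,
              "element count of an actual array type");
```

Inside such a function, sizeof(values) / sizeof(values[0]) would divide the size of a pointer by the size of an int, which is the situation the warning below describes.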
Note: This code is taken from the Source Engine SDK.
PVS-Studio warning: V511 The sizeof() operator returns size of the pointer, and not of the array, in 'sizeof (iNeighbors)' expression. Vrad_dll disp_vrad.cpp 60
Such confusion can arise because the size of the array is specified in the parameter declaration: this number means nothing to the compiler and is just a hint to the programmer.
The trouble is that this code compiles, and the programmer is unaware that something is wrong. The obvious solution would be to use metaprogramming:
|
template
|
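One common way to write such a helper is sketched below. It may differ in detail from the listing the article showed; the idea is that the parameter is a reference to an array, so a pointer argument cannot bind to it.

```cpp
#include <cstddef>

// Accepts only real arrays: N is deduced from the array type.
template <class T, std::size_t N>
constexpr std::size_t countof(const T (&)[N]) noexcept {
    return N;
}

// Usage sketch:
//   int values[7];
//   countof(values);      // 7
//   int* p = values;
//   countof(p);           // does not compile: p is not an array
```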
If we pass something other than an array to this function, we get a compilation error. In C++17 you can use std::size.
In C++11, std::extent was added, but it isn't suitable as a countof replacement, because it returns 0 for inappropriate types.
std::extent<decltype(iNeighbors)>(); //=> 0
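The difference between the two facilities can be checked at compile time. This is a small sketch with made-up names, assuming a C++17 compiler for std::size.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// std::extent answers silently even when the type is not an array.
static_assert(std::extent<int[16]>::value == 16, "real array: element count");
static_assert(std::extent<int*>::value == 0, "pointer: quietly yields 0");

inline std::size_t checked_count() {
    int values[16] = {};
    // std::size(values) works; std::size applied to an int* would not compile.
    return std::size(values);
}
```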
You can make an error not only with countof, but with sizeof as well.
|
|
Note: This code is taken from Chromium.
PVS-Studio warnings:
• V511 The sizeof() operator returns size of the pointer, and not of the array, in 'sizeof (salt)' expression. browser visitedlink_master.cc 968
• V512 A call of the 'memcpy' function will lead to underflow of the buffer 'salt_'. browser visitedlink_master.cc 968
As you can see, built-in C++ arrays have a lot of problems. This is why you should use std::array: in modern C++ its API is similar to std::vector and other containers, and it's harder to make an error when using it.
|
void
|
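A sketch of how std::array keeps the size attached to the type when it crosses a function boundary. The alias Salt, its length of 8 bytes, and the function name are assumptions made for this example, not the Chromium code.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Salt = std::array<std::uint8_t, 8>;  // hypothetical fixed-size salt

// The parameter is a reference to the whole array object, not a decayed pointer.
inline Salt store_salt(const Salt& salt) {
    Salt stored{};
    // size() is the real element count, so the byte count is correct.
    std::memcpy(stored.data(), salt.data(),
                salt.size() * sizeof(Salt::value_type));
    return stored;
}
```

In everyday code a plain assignment (stored = salt) does the same job; memcpy is kept here only to mirror the kind of call the warnings refer to.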
One more source of errors is a simple for loop. You may think, "Where can you make a mistake there? Is it something connected with a complex exit condition or with saving on lines of code?" No, programmers make errors in the simplest loops. Let's take a look at fragments from real projects:
|
const
|
Note: This code is taken from the Haiku operating system.
PVS-Studio warning: V706 Suspicious division: sizeof (kBaudrates) / sizeof (char *). Size of every element in 'kBaudrates' array does not equal to divisor. SerialWindow.cpp 162
We examined such errors in detail in the previous chapter: once again, the array size wasn't evaluated correctly. We can easily fix it by using std::size:
|
const
|
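A minimal sketch of the std::size variant, with an illustrative table of baud rates (the values and the function name are invented for this example):

```cpp
#include <cstddef>
#include <iterator>

inline int sum_rates() {
    const int rates[] = {300, 1200, 2400, 9600};  // illustrative values
    int sum = 0;
    // std::size takes the element count from the array type itself,
    // so there is no divisor that could name the wrong element type.
    for (std::size_t i = 0; i < std::size(rates); ++i) {
        sum += rates[i];
    }
    return sum;
}
```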
But there is a better way. Let's take a look at one more fragment.
|
inline
|
Note: This code is taken from Shareaza.
PVS-Studio warning: V547 Expression 'nCharPos >= 0' is always true. Unsigned type value is always >= 0. BugTrap xmlreader.h 946
It's a typical error when writing a reverse loop: the programmer forgot that the index has an unsigned type, so the check always returns true. You might think, "How come? Only novices and students make such mistakes. We professionals don't." Unfortunately, this is not completely true. Of course, everyone understands that (unsigned >= 0) is always true. So where do such errors come from? They often occur as a result of refactoring. Imagine this situation: the project migrates from a 32-bit platform to a 64-bit one. Previously, int/unsigned was used for indexing, and a decision was made to replace them with size_t/ptrdiff_t. But in one fragment an unsigned type was accidentally used instead of a signed one.
What should we do to avoid this situation in our code? Some people advise using signed types, as in C# or Qt. Perhaps that could be a way out, but if we want to work with large amounts of data, there is no way to avoid size_t.
Is there a more secure way to iterate through an array in C++? Of course there is. Let's start with the simplest one: non-member functions. There are standard functions for working with collections, arrays and initializer_list; their principle should be familiar to you.
|
char for
|
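A sketch of a reverse traversal with the non-member functions std::rbegin and std::rend (available since C++14). The buffer contents and the function name are made up for this illustration.

```cpp
#include <iterator>
#include <string>

inline std::string reversed_copy() {
    const char buf[] = {'a', 'b', 'c'};
    std::string out;
    // No index and no `>= 0` check: the iterators know where the range ends.
    for (auto it = std::rbegin(buf); it != std::rend(buf); ++it) {
        out += *it;
    }
    return out;
}
```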
Great, now we do not need to remember the difference between a forward and a reverse loop. There is also no need to think about whether we use a built-in array or a std::array: the loop will work in either case. Using iterators is a great way to avoid headaches, but even that is not always good enough. It is best to use the range-based for loop:
|
char for
|
Of course, the range-based for has some flaws: it doesn't allow flexible control of the loop, and if more complex work with indexes is required, it won't be of much help to us. But such situations should be examined separately. Our situation is quite simple: we have to go through the items in reverse order. However, even at this stage there are difficulties. The standard library has no additional classes to support this with the range-based for. Let's see how it could be implemented:
|
template struct
template
|
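One possible shape of such a wrapper is sketched below, using C++14 return type deduction. The name reversed_wrapper appears in the article, but this body is a reconstruction under assumptions rather than the original listing; demo_reversed is a hypothetical usage function.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Holds a reference to the range; it does not own or copy it,
// so it must not outlive the object it wraps.
template <class T>
struct reversed_wrapper {
    const T& source;
};

template <class T>
reversed_wrapper<T> reversed(const T& v) {
    return {v};
}

// Free begin/end found by the range-based for through argument-dependent lookup.
template <class T>
auto begin(const reversed_wrapper<T>& w) {
    return std::rbegin(w.source);
}

template <class T>
auto end(const reversed_wrapper<T>& w) {
    return std::rend(w.source);
}

inline std::string demo_reversed() {
    const char arr[] = {'a', 'b', 'c'};
    const std::vector<int> v{1, 2, 3};
    std::string out;
    for (char c : reversed(arr)) out += c;                       // built-in array
    for (int x : reversed(v)) out += static_cast<char>('0' + x); // std::vector
    return out;
}
```

Because the wrapper stores only a reference, passing a temporary container to reversed would leave it dangling; the sketch is meant for named objects.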
In C++14 you can simplify the code by removing the decltype. You can see how auto helps you write template functions - reversed_wrapper will work both with an array, and std::vector.
Now we can rewrite the fragment as follows:
|
char for
|
What's great about this code? Firstly, it is very easy to read: we immediately see that the array elements are traversed in reverse order. Secondly, it's harder to make an error. And thirdly, it works with any type. This is much better than what we had.
With Boost, you can use boost::adaptors::reverse(arr).
But let's go back to the original example. There, the array is passed as a pointer-size pair. Obviously, our idea with reversed will not work for it. What should we do? Use classes like span/array_view. In C++17 we have string_view, and I suggest using that:
|
void
|
string_view does not own the string; in fact, it's a wrapper around a const char* and a length. That's why in the code example the string is passed by value, not by reference. A key feature of string_view is its compatibility with strings in various representations: const char*, std::string and non-null-terminated const char*.
As a result, the function takes the following form:
|
inline
|
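To illustrate reverse traversal over a view, here is a small sketch. The helper trimmed_length is hypothetical and is not the Shareaza function; it only shows that the reverse iterators of a view remove the need for an unsigned index counting down.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: length of the text without trailing spaces.
inline std::size_t trimmed_length(std::wstring_view text) {
    std::size_t len = text.size();
    // Walk backwards with reverse iterators instead of an index and `>= 0`.
    for (auto it = text.rbegin(); it != text.rend() && *it == L' '; ++it) {
        --len;
    }
    return len;
}
```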
When calling the function, it's important to remember that the constructor string_view(const char*) is implicit, which is why it is possible to write this:
Foo(pChars);
instead of this:
Foo(wstring_view(pChars, nNumChars));
The first form determines the length by searching for the terminating null, so it is only correct for null-terminated strings.
The string that a string_view points to does not need to be null-terminated; the very name string_view::data hints at this, and it is necessary to keep that in mind when using it. If you pass its value to a function from cstdlib that expects a C string, you can get undefined behavior. This is easy to miss if most of the cases you test use std::string or null-terminated strings.
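The following sketch shows a view that is not null-terminated and one safe way to hand it to a C-style API. The function names are invented for this example.

```cpp
#include <string>
#include <string_view>

// A view over the first five characters of a longer buffer.
inline std::string_view first_word() {
    static const char text[] = "hello world";
    return std::string_view(text, 5);
}

// Making an owning copy guarantees a terminating null via c_str().
inline std::string to_c_compatible(std::string_view sv) {
    return std::string(sv);
}
```

Here first_word().data() points at "hello world", so a function that reads up to the terminating null would see all eleven characters rather than five; to_c_compatible(first_word()).c_str() yields exactly "hello".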
Let's leave C++ for a second and think about good old C. How is security there? After all, there are no problems with implicit constructor calls and operators or type conversion, and there are no problems with various string types. In practice, errors often occur in the simplest constructs: the most complicated ones are thoroughly reviewed and debugged because they raise doubts, while programmers forget to check the simple ones. Here is an example of a dangerous construct that came to us from C:
|
enum
enum
int
|
This example is from the Linux kernel. PVS-Studio warning: V556 The values of different enum types are compared: switch(ENUM_TYPE_A) { case ENUM_TYPE_B: ... }. libiscsi.c 3501
Pay attention to the values in the switch-case: one of the named constants is taken from a different enumeration. In the original, of course, there is much more code and there are more possible values, so the error isn't as obvious. The reason is the lax typing of enum: its values are implicitly convertible to int, and this leaves a lot of room for errors.
In C++11 you can, and should, use enum class: such a trick won't work there, and the error will show up at the compilation stage. As a result, the following code does not compile, which is exactly what we need:
|
enum
enum
int
|
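A sketch of the same idea with invented enumeration names (they are not the kernel's identifiers). The offending case label is left as a comment so that the rest of the example compiles.

```cpp
enum class ConnState { Up, Down };         // hypothetical names
enum class SessionState { Logged, Free };  // hypothetical names

inline const char* describe(ConnState s) {
    switch (s) {
        case ConnState::Up:   return "up";
        case ConnState::Down: return "down";
        // case SessionState::Free: return "free";
        //   error: a SessionState value cannot be converted to ConnState
    }
    return "unknown";
}
```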
The following fragment is not quite connected with the enum, but has similar symptoms:
|
void
|
Note: This code is taken from ReactOS.
Yes, the values of errno are declared as macros, which is bad practice in C++ (and in C as well), but even if the programmer had used an enum, it wouldn't have made life easier. The lost comparison will not reveal itself in the case of an enum (and especially not in the case of a macro). An enum class, on the other hand, would not allow this, as there would be no implicit conversion to bool.
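A sketch of the lost comparison, using a made-up scoped enumeration instead of errno values. The faulty expression is kept as a comment because, with enum class, it is rejected by the compiler.

```cpp
enum class Err { None, Again, WouldBlock };  // hypothetical error codes

inline bool is_retryable(Err e) {
    // Lost comparison, rejected for a scoped enum (no conversion to bool):
    //   return e == Err::Again || Err::WouldBlock;
    return e == Err::Again || e == Err::WouldBlock;
}
```

With a macro or an unscoped enum, the commented-out form compiles and is true for any nonzero constant on the right-hand side.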
But back to the native C++ problems. One of them shows up when an object needs to be initialized in the same way in several constructors. A simple situation: there is a class with two constructors, and one of them calls the other. It all looks pretty logical: the common code is put into a separate method, since nobody likes to duplicate code. What's the pitfall?
|
|
Note: This code is taken from LibreOffice.
PVS-Studio warning: V603 The object was created but it is not being used. If you wish to call constructor, 'this->Guess::Guess(....)' should be used. guess.cxx 56
The pitfall is in the syntax of the constructor call. Quite often it is forgotten, and the programmer creates one more class instance, which is then immediately destroyed. That is, the original instance is never initialized. Of course, there are 1001 ways to fix this. For example, we can explicitly call the constructor via this, or put everything into a separate function:
|
|
By the way, an explicit repeated call of the constructor, for example via this, is a dangerous game, and we need to understand what's going on. The variant with Init() is much better and clearer. For those who want to understand the details of these "pitfalls" better, I suggest looking at chapter 19, "How to properly call one constructor from another", from this book.
But it is best to use the delegation of the constructors here. So we can explicitly call one constructor from another in the following way:
|
|
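A minimal sketch of constructor delegation. The class name Guess is borrowed from the warning above, but the members and default values are invented for this illustration and do not reflect the LibreOffice class.

```cpp
#include <string>
#include <utility>

class Guess {
public:
    Guess(std::string language, int score)
        : language_(std::move(language)), score_(score) {}

    // Delegating constructor (C++11): forwards to the constructor above.
    Guess() : Guess("unknown", 0) {}

    // Pitfall for comparison: a body such as `{ Guess("unknown", 0); }` only
    // creates and destroys a temporary; the members of *this never receive
    // the intended values.
    // Also rejected by the compiler: `Guess() : Guess("unknown", 0), score_(1) {}`,
    // because a delegating constructor cannot initialize anything else.

    const std::string& language() const { return language_; }
    int score() const { return score_; }

private:
    std::string language_;
    int score_;
};
```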
Such constructors have several limitations. First: a delegating constructor hands over full responsibility for initializing the object to its target. That is, it won't be possible to initialize another class field alongside the delegation in the initialization list:
|
|
And of course, we have to make sure that the delegation doesn't create a loop, as it will be impossible to exit it. Unfortunately, this code compiles:
|
|
Original article: https://www.cnblogs.com/wujianming-110117/p/13198526.html