libs (traits | iterators)
is_sortedAdd the methods is_sorted, is_sorted_by and is_sorted_by_key to [T];
add the methods is_sorted and is_sorted_by to Iterator.
In quite a few situations, one needs to check whether a sequence of elements is sorted. The most important use cases are probably unit tests and pre-/post-condition checks.
The lack of an is_sorted() function in Rust's standard library has led to
countless programmers implementing their own.
While it is possible to write a one-liner using iterators (e.g.
(0..arr.len() - 1).all(|i| arr[i] <= arr[i + 1])¹), it is still unnecessary
mental overhead while writing and reading the code.
In the corresponding issue on the main repository (from which a few comments are referenced) everyone seems to agree on the basic premise: we want such a function.
Having is_sorted() and friends in the standard library would:
Another proof of this functions' usefulness is the inclusion in the
standard library of many other languages:
C++'s std::is_sorted,
Go's sort.IsSorted,
D's std.algorithm.sorting.is_sorted
and others. (Curiously, many (mostly) more high-level programming language –
like Ruby, Javascript, Java, Haskell and Python – seem to lack such a function.)
¹ In the initial version of this RFC, this code snippet contained a bug
(< instead of <=). This subtle mistake happens very often: in this RFC,
in the discussion thread about this RFC,
in this StackOverflow answer
and in many more places. Thus, avoiding this common bug is another good
reason to add is_sorted().
Lastly, it is possible to implement is_sorted for many common types with SIMD
instructions which improves speed significantly. It is unlikely that many
programmers will take the time to write SIMD code themselves, thus everyone
would benefit if this rather difficult implementation work is done in the
standard library.
Possible documentation of the two new methods of Iterator as well as
[T]::is_sorted_by_key:
fn is_sorted(self) -> bool where Self::Item: PartialOrd,Checks if the elements of this iterator are sorted.
That is, for each element
aand its following elementb,a <= bmust hold. If the iterator yields exactly zero or one element,trueis returned.Note that if
Self::Itemis onlyPartialOrd, but notOrd, the above definition implies that this function returnsfalseif any two consecutive items are not comparable.Example
assert!([1, 2, 2, 9].iter().is_sorted()); assert!(![1, 3, 2, 4).iter().is_sorted()); assert!([0].iter().is_sorted()); assert!(std::iter::empty::<i32>().is_sorted()); assert!(![0.0, 1.0, std::f32::NAN].iter().is_sorted());
fn is_sorted_by<F>(self, compare: F) -> bool where F: FnMut(&Self::Item, &Self::Item) -> Option<Ordering>,Checks if the elements of this iterator are sorted using the given comparator function.
Instead of using
PartialOrd::partial_cmp, this function uses the givencomparefunction to determine the ordering of two elements. Apart from that, it's equivalent tois_sorted; see its documentation for more information.
(for
[T])fn is_sorted_by_key<F, K>(&self, f: F) -> bool where F: FnMut(&T) -> K, K: PartialOrd,Checks if the elements of this slice are sorted using the given key extraction function.
Instead of comparing the slice's elements directly, this function compares the keys of the elements, as determined by
f. Apart from that, it's equivalent tois_sorted; see its documentation for more information.Example
assert!(["c", "bb", "aaa"].is_sorted_by_key(|s| s.len())); assert!(![-2i32, -1, 0, 3].is_sorted_by_key(|n| n.abs()));
The methods [T]::is_sorted and [T]::is_sorted_by will have analogous
documentations to the ones shown above.
This RFC proposes to add the following methods to [T] (slices) and
Iterator:
impl<T> [T] {
fn is_sorted(&self) -> bool
where
T: PartialOrd,
{ ... }
fn is_sorted_by<F>(&self, compare: F) -> bool
where
F: FnMut(&T, &T) -> Option<Ordering>,
{ ... }
fn is_sorted_by_key<F, K>(&self, f: F) -> bool
where
F: FnMut(&T) -> K,
K: PartialOrd,
{ ... }
}
trait Iterator {
fn is_sorted(self) -> bool
where
Self::Item: PartialOrd,
{ ... }
fn is_sorted_by<F>(mut self, compare: F) -> bool
where
F: FnMut(&Self::Item, &Self::Item) -> Option<Ordering>,
{ ... }
}
In addition to the changes shown above, the three methods added to [T] should
also be added to core::slice::SliceExt as they don't require heap
allocations.
To repeat the exact semantics from the prior section: the methods return
true if and only if for each element a and its following element b, the
condition a <= b holds. For slices/iterators with zero or one element,
true is returned. For elements which implement PartialOrd, but not Ord,
the function returns false if any two consecutive elements are not
comparable (this is an implication of the a <= b condition from above).
A sample implementation can be found here.
It increases the size of the standard library by a tiny bit.
Iterator, but not to [T]Without is_sorted() defined for slices directly, one can still fairly easily
test if a slice is sorted by obtaining an iterator via iter(). So instead of
v.is_sorted(), one would need to write v.iter().is_sorted().
This always works for is_sorted() because of the PartialOrd blanket impl
which implements PartialOrd for all references to an PartialOrd type. For
is_sorted_by it would introduce an additional reference to the closures'
arguments (i.e. v.iter().is_sorted_by(|a, b| ...)) where a and b are
&&T).
While these two inconveniences are not deal-breakers, being able to call those
three methods on slices (and all Deref<Target=[T]> types) directly, could be
favourable for many programmers (especially given the popularity of slice-like
data structures, like Vec<T>). Additionally, the sort method and friends
are defined for slices, thus one might expect the is_sorted() method there,
too.
LinkedList) as wellAdding these methods to every data structure in the standard libary is a lot of
duplicate code. Optimally, we would have a trait that represents sequential
data structures and would only add is_sorted and friends to said trait. We
don't have such a trait as of now; so Iterator is the next best thing. Slices
deserve special treatment due to the reasons mentioned above (popularity and
sort()).
Iterator::while_sorted, is_sorted_until, sorted_prefix, num_sorted, ...In the issue on the main repository,
concerns about completely consuming the iterator were raised. Some alternatives,
such as while_sorted,
were suggested. However, consuming the iterator is neither uncommon nor a
problem. Methods like count(), max() and many more consume the iterator,
too. One comment mentions:
I am a bit skeptical of the equivalent on Iterator just because the return value does not seem actionable -- you aren't going to "sort" the iterator after you find out it is not already sorted. What are some use cases for this in real code that does not involve iterating over a slice?
As mentioned above, Iterator is the next best thing to a trait representing
sequential data structures. So to check if a LinkedList, VecDeque or
another sequential data structure is sorted, one would simply call
collection.iter().is_sorted(). It's likely that this is the main usage for
Iterator's is_sorted methods. Additionally, code like
if v.is_sorted() { v.sort(); } is not very useful: sort() already runs in
O(n) for already sorted arrays.
Suggestions like is_sorted_until are not really useful either: one can easily
get a subslice or a part of an iterator (via .take()) and call is_sorted()
on that part.
Iterator::is_sorted_by_key be added as well?This RFC proposes to add is_sorted_by_key only to [T] but not to
Iterator. The latter addition wouldn't be too useful since once could easily
achieve the same effect as .is_sorted_by_key(...) by calling
.map(...).is_sorted(). It might still be favourable to include said function
for consistency and ease of use. The standard library already hosts a number of
sorting-related functions all of which come in three flavours: raw, _by and
_by_key. By now, programmers could expect there to be an is_sorted_by_key
as well.
std::cmp::is_sorted insteadAs suggested here,
one could also add this free function (plus the _by and _by_key versions)
to std::cmp:
fn is_sorted<C>(collection: C) -> bool
where
C: IntoIterator,
C::Item: Ord,
This can be seen as a better design as it avoids the question about which data
structure should get is_sorted methods. However, it might have the
disadvantage of being less discoverable and also less convenient (long path or
import).
Ord instead of only PartialOrdAs proposed in this RFC, is_sorted only requires its elements to be
PartialOrd. If two non-comparable elements are encountered, false is
returned. This is probably the only useful way to define the function for
partially orderable elements.
While it's convenient to call is_sorted() on slices containing only
partially orderable elements (like floats), we might want to use the stronger
Ord bound:
is_sorted on something will probably make
most programmers think, that calling sort on the same thing is possible,
too. Having different bounds for is_sorted and sort thus might lead to
confusion.is_sorted_by function currently uses a closure which returns
Option<Ordering>. This differs from the closure for sort_by and looks a
bit more complicated than necessary for most cases.