可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I have a set of domain objects that inherit from a shared type (i.e. GroupRecord extends Record
, RequestRecord extends Record
). The subtypes have specific properties (i.e. GroupRecord::getCumulativeTime
, RequestRecord::getResponseTime
).
Further, I have a list of records with mixed subtypes as a result of parsing a logfile.
List<Record> records = parseLog(...);
In order to compute statistics on the log records, I want to apply math functions only on a subset of the records that matches a specific subtype, i.e. only on GroupRecord
s. Therefore I want to have a filtered stream of specific subtypes. I know that I can apply a filter
and map
to a subtype using
records.stream()
.filter(GroupRecord.class::isInstance)
.map(GroupRecord.class::cast)
.collect(...
Apply this filter&cast on the stream multiple times (especially when doing it for the same subtype multiple times for different computations) is not only cumbersome but produces lots of duplication.
My current approach is to use a TypeFilter
class TypeFilter<T>{
private final Class<T> type;
public TypeFilter(final Class<T> type) {
this.type = type;
}
public Stream<T> filter(Stream<?> inStream) {
return inStream.filter(type::isInstance).map(type::cast);
}
}
To be applied to a stream:
TypeFilter<GroupRecord> groupFilter = new TypeFilter(GroupRecord.class);
SomeStatsResult stats1 = groupFilter.filter(records.stream())
.collect(...)
SomeStatsResult stats2 = groupFilter.filter(records.stream())
.collect(...)
It works, but I find this approach a bit much for such a simple task. Therefore I wonder, is there a better or what is the best way for making this behavior reusable using streams and functions in a concise and readable way?
回答1:
It depends on what do you find "more concise and readable". I myself would argue that the way you already implemented is fine as it is.
However, there is indeed a way to do this in a way that is slightly shorter from the point of where you use it, by using Stream.flatMap
:
static <E, T> Function<E, Stream<T>> onlyTypes(Class<T> cls) {
return el -> cls.isInstance(el) ? Stream.of((T) el) : Stream.empty();
}
What it would do is it will convert each original stream element to either a Stream
of one element if the element has expected type, or to an empty Stream
if it does not.
And the use is:
records.stream()
.flatMap(onlyTypes(GroupRecord.class))
.forEach(...);
There are obvious tradeoffs in this approach:
- You do lose the "filter" word from your pipeline definition. That may be more confusing that the original, so maybe a better name than
onlyTypes
is needed.
Stream
objects are relatively heavyweight, and creating so much of them may result in performance degradation. But you should not trust my word here and profile both variants under heavy load.
Edit:
Since the question asks about reusing filter
and map
in slightly more general terms, I feel like this answer can also discuss a little more abstraction. So, to reuse filter and map in general terms, you need the following:
static <E, R> Function<E, Stream<R>> filterAndMap(Predicate<? super E> filter, Function<? super E, R> mapper) {
return e -> filter.test(e) ? Stream.of(mapper.apply(e)) : Stream.empty();
}
And original onlyTypes
implementation now becomes:
static <E, R> Function<E, Stream<R>> onlyTypes(Class<T> cls) {
return filterAndMap(cls::isInstance, cls::cast);
}
But then, there is yet again a tradeoff: resulting flat mapper function will now hold captured two objects (predicate and mapper) instead of single Class
object in above implementation. It may also be a case of over-abstracting, but that one depends on where and why you would need that code.
回答2:
You don’t need an entire class to encapsulate a piece of code. The smallest code unit for that purpose, would be a method:
public static <T> Stream<T> filter(Collection<?> source, Class<T> type) {
return source.stream().filter(type::isInstance).map(type::cast);
}
This method can be used as
SomeStatsResult stats1 = filter(records, GroupRecord.class)
.collect(...);
SomeStatsResult stats2 = filter(records, GroupRecord.class)
.collect(...);
If the filtering operation isn’t always the first step in your chain, you may overload the method:
public static <T> Stream<T> filter(Collection<?> source, Class<T> type) {
return filter(source.stream(), type);
}
public static <T> Stream<T> filter(Stream<?> stream, Class<T> type) {
return stream.filter(type::isInstance).map(type::cast);
}
However, if you have to repeat this operation multiple times for the same type, it might be beneficial to do
List<GroupRecord> groupRecords = filter(records, GroupRecord.class)
.collect(Collectors.toList());
SomeStatsResult stats1 = groupRecords.stream().collect(...);
SomeStatsResult stats2 = groupRecords.stream().collect(...);
not only eliminating the code duplication in source code, but also performing the runtime type checks only once. The impact of the required additional heap space depends on the actual use case.
回答3:
WHAT you actually need is a Collector to collecting all elements in the stream that is instance of special type. It can solving your problem easily and avoiding filtering the stream twice:
List<GroupRecord> result = records.stream().collect(
instanceOf(GroupRecord.class, Collectors.toList())
);
SomeStatsResult stats1 = result.stream().collect(...);
SomeStatsResult stats2 = result.stream().collect(...);
AND you can do something as further like as Stream#map by using Collectors#mapping, for example:
List<Integer> result = Stream.of(1, 2L, 3, 4.)
.collect(instanceOf(Integer.class, mapping(it -> it * 2, Collectors.toList())));
| |
| [2,6]
[1,3]
WHERE you only want to consuming the Stream
once, you can easily composing the last Collector
as below:
SomeStatsResult stats = records.stream().collect(
instanceOf(GroupRecord.class, ...)
);
static <T, U extends T, A, R> Collector<T, ?, R> instanceOf(Class<U> type
, Collector<U, A, R> downstream) {
return new Collector<T, A, R>() {
@Override
public Supplier<A> supplier() {
return downstream.supplier();
}
@Override
public BiConsumer<A, T> accumulator() {
BiConsumer<A, U> target = downstream.accumulator();
return (result, it) -> {
if (type.isInstance(it)) {
target.accept(result, type.cast(it));
}
};
}
@Override
public BinaryOperator<A> combiner() {
return downstream.combiner();
}
@Override
public Function<A, R> finisher() {
return downstream.finisher();
}
@Override
public Set<Characteristics> characteristics() {
return downstream.characteristics();
}
};
}
Why did you need to composes Collectors?
Did you remember Composition over Inheritance Principle? Did you remember assertThat(foo).isEqualTo(bar) and assertThat(foo, is(bar)) in unit-test?
Composition is much more flexible, it can reuses a piece of code and composeing components together on runtime, that is why I prefer hamcrest
rather than fest-assert
since it can composing all possible Matcher
s together. and that is why functional programming is most popular since it can reuses any smaller piece of function code than class level reusing. and you can see jdk has introduced Collectors#filtering in jdk-9 that will make the execution routes shorter without losing its expressiveness.
AND you can refactoring the code above according to Separation of Concerns as further, then filtering
can be reused like as jdk-9 Collectors#filtering:
static <T, U extends T, A, R> Collector<T, ?, R> instanceOf(Class<U> type
, Collector<U, A, R> downstream) {
return filtering(type::isInstance, Collectors.mapping(type::cast, downstream));
}
static <T, A, R>
Collector<T, ?, R> filtering(Predicate<? super T> predicate
, Collector<T, A, R> downstream) {
return new Collector<T, A, R>() {
@Override
public Supplier<A> supplier() {
return downstream.supplier();
}
@Override
public BiConsumer<A, T> accumulator() {
BiConsumer<A, T> target = downstream.accumulator();
return (result, it) -> {
if (predicate.test(it)) {
target.accept(result, it);
}
};
}
@Override
public BinaryOperator<A> combiner() {
return downstream.combiner();
}
@Override
public Function<A, R> finisher() {
return downstream.finisher();
}
@Override
public Set<Characteristics> characteristics() {
return downstream.characteristics();
}
};
}