Re: Hbase state backend in Flink


Hi!

While it is certainly possible, I think it's a bad idea in general.

I think state size by itself shouldn't be a problem with the RocksDB
backend, as you can always increase parallelism to shard the state further
while keeping far better performance than a remote KV store would give you.
We and other users have successfully run the RocksDB state backend with
incremental snapshots on several terabytes of state in production for years.
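
To illustrate the sharding point: Flink assigns every key to a key group and
spreads the key groups over the parallel subtasks, so each subtask's local
RocksDB instance holds roughly 1/parallelism of the keyed state. Below is a
minimal sketch of that assignment. The class and method names are made up for
illustration, and the hash step is simplified (Flink additionally scrambles
hashCode() with a murmur hash), so treat it as an approximation rather than
Flink's actual code.

```java
/** Simplified illustration of how keyed state is spread over parallel subtasks. */
final class KeyGroupSharding {

    // Map a key to one of maxParallelism key groups.
    // Assumption: the real implementation also applies a murmur hash; omitted here.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Map a key group to a subtask index in [0, parallelism).
    static int subtaskFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Count how many of the integer keys 0..numKeys-1 land on each subtask.
    static int[] keysPerSubtask(int numKeys, int maxParallelism, int parallelism) {
        int[] counts = new int[parallelism];
        for (int key = 0; key < numKeys; key++) {
            int group = keyGroupFor(key, maxParallelism);
            counts[subtaskFor(group, maxParallelism, parallelism)]++;
        }
        return counts;
    }
}
```

With 1280 integer keys and a max parallelism of 128, 4 subtasks hold 320 keys
each and 8 subtasks hold 160 each, so doubling parallelism halves the state
each local store has to manage.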

The only real advantage I see for HBase and similar KV stores as a state
backend is the instant recovery you get, but even in that case you probably
want an implementation that combines an embedded and a remote KV store.
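
A rough sketch of what such a combined store could look like: reads are
served from the embedded copy, writes go to both tiers, and after a restart
the local tier starts empty and is refilled lazily from the remote one. All
names here (RemoteKv, InMemoryRemoteKv, TieredKvStore) are hypothetical, the
HashMap only stands in for an embedded RocksDB instance, and this is not
Flink's state backend interface. A real implementation would likely batch or
write asynchronously instead of writing through on every put.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical remote store interface; stands in for an HBase-like client. */
interface RemoteKv {
    void put(String key, String value);
    String get(String key);
}

/** In-memory stand-in for the remote store, for illustration only. */
final class InMemoryRemoteKv implements RemoteKv {
    private final Map<String, String> data = new HashMap<>();
    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
}

/** Write-through store: reads hit the local tier first, writes go to both tiers. */
final class TieredKvStore {
    // Stands in for the embedded store (e.g. RocksDB) on the task's local disk.
    private final Map<String, String> local = new HashMap<>();
    private final RemoteKv remote;

    TieredKvStore(RemoteKv remote) { this.remote = remote; }

    void put(String key, String value) {
        local.put(key, value);   // fast path for subsequent reads
        remote.put(key, value);  // durable copy that survives a task failure
    }

    String get(String key) {
        String value = local.get(key);
        if (value == null) {
            // Local miss (e.g. right after recovery): fall back to the remote tier.
            value = remote.get(key);
            if (value != null) local.put(key, value);
        }
        return value;
    }

    int localSize() { return local.size(); }
}
```

Creating a second TieredKvStore over the same RemoteKv mimics recovery: it
can serve reads immediately with an empty local tier, at the cost of one
remote round trip per key on first access.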

Also, the RocksDB backend, which has no external dependencies, will be far
more reliable in practice.

Cheers
Gyula

On Thu, 27 Dec 2018 at 17:17, Naveen Kumar
<naveenkumar.g@xxxxxxxxxxxx.invalid> wrote:

> Hi,
>
> I am exploring whether we can plug in HBase as a state backend in Flink. We
> have a need for streaming jobs with large window state, high throughput and
> reliability.
>
> I wanted to know if implementing a Flink state backend on HBase or another
> distributed KV store is possible. Any documentation or pointers would be
> helpful.
>
> Thanks,
> Naveen
>