Thread: AW: Re: New Linux xfs/reiser file systems

AW: Re: New Linux xfs/reiser file systems

From
Zeugswetter Andreas SB
Date:
> > I think it's worth noting that Oracle has been petitioning the
> > kernel developers for better raw device support: in other words,
> > the ability to write directly to the hard disk, bypassing the
> > filesystem altogether.
> 
> But there could be other reasons why Oracle would want to do 
> raw stuff.

The reasons are: 
1. Most Unixen now have raw devices shared between several machines. Oracle needs this for their shared-everything
Parallel Server. Only two Unixen that I know of have shared filesystems (IBM gpfs and Sun Veritas), and both are rather new.
 
2. The allocation time for raw devices is far better (near instantaneous) than creating preallocated files in a fs.
Providing 1 TB of raw devices is a task of minutes; creating 1 TB filesystems with preallocated 2 GB files is a task of
hours at best.
 
3. Absolute control over writes and page location (you don't want interleaved pages).
4. Efficient use of buffer memory. Usual use of filesystems buffers the disk pages twice: one copy in the db buffer
pool, one in the OS file cache.
 
5. Async raw IO. Most Unixes provide async IO on raw devices; only some provide async IO on filesystem files. Async
IO has two advantages: CPU work can be done while waiting for IO, and IO can complete within one OS timeslice (20 us). This
is possible with modern disk systems that have large caches.
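
The first advantage in point 5 (doing CPU work while an IO is in flight) can be sketched as follows. This is only an illustration: it emulates asynchronous IO with a worker thread rather than using a real kernel async IO interface, and the 8 KB page size, the scratch file, and the names read_page, checksum and demo are assumptions made for the example, not anything from this thread.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 8192  # assumed page size, for illustration only


def read_page(path, page_no):
    """Blocking read of one page; runs in a worker thread."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)
    finally:
        os.close(fd)


def checksum(data):
    """Stand-in for CPU work done while the read is in flight."""
    return sum(data) % 65536


def demo():
    # Build a 4-page scratch file; page n is filled with byte value n.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for n in range(4):
            f.write(bytes([n]) * PAGE_SIZE)
        path = f.name
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            pending = pool.submit(read_page, path, 3)  # start the IO
            busy = checksum(b"\x01" * 1000)            # CPU work meanwhile
            page = pending.result()                    # collect the IO result
        return busy, page
    finally:
        os.unlink(path)
```

Calling demo() returns the result of the CPU work together with the page that was read in the background.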
 

Andreas


Re: AW: Re: New Linux xfs/reiser file systems

From
Giles Lean
Date:
> 2. The allocation time for raw devices is by far better (near
>     instantaneous) than creating preallocated files in a
>     fs. Providing 1 TB of raw devices is a task of minutes,
>     creating 1 TB filesystems with preallocated 2 GB files is a
>     task of hours at best.

Filesystem dependent, surely?  Veritas' VxFS can create filesystems
quickly, and quickly preallocate space for the files.  If you actually
want to write data into the files that would take longer. :)
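
The difference between reserving space and actually writing data can be sketched like this. It is a minimal illustration, assuming a POSIX system where Python's os.posix_fallocate is available; how fast the reservation really is depends on the filesystem, and the helper names preallocate and fill_with_zeros are made up for the example.

```python
import os


def preallocate(path, size):
    """Ask the filesystem to reserve `size` bytes without writing the data ourselves."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)


def fill_with_zeros(path, size, chunk=1 << 20):
    """Slow path: create the file by writing every byte explicitly."""
    zeros = bytes(chunk)
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
```

Both helpers leave a file of the requested size; the first leaves it to the filesystem to decide how much work that takes, the second always pays the full cost of the writes.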

Creating a 1TB UFS filesystem might take a while, and UFS doesn't
support pre-allocation of space as far as I know so creating 2GB files
would take time too.  Perhaps hours. :-(
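
A back-of-the-envelope check of the "hours" estimate, as a sketch only: the 50 MB/s sustained write rate is an assumed figure chosen for illustration, not a number from this thread.

```python
TB = 1024 ** 4
GB = 1024 ** 3
MB = 1024 ** 2

total_bytes = 1 * TB
file_size = 2 * GB
write_rate = 50 * MB  # bytes per second; assumed, not measured

n_files = total_bytes // file_size        # number of 2 GB files in 1 TB
hours = total_bytes / write_rate / 3600   # time to write every byte once

print(f"{n_files} files, about {hours:.1f} hours to zero-fill")
```

Under that assumption, filling 512 files of 2 GB each takes a bit under six hours, which is consistent with "hours" for a filesystem that cannot preallocate.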

> 3. absolute control over writes and page location (you don't want
> interleaved pages)

As well as a filesystem, most large systems I'm familiar with use
volume management software (VxVM, LVM, ...) and their "disks" will be
allocated space on disk arrays.

These additional layers aren't arguments against simplifying the
filesystem layer, but they sure will complicate measurement and
tuning. :-)

Regards,

Giles