<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE> [C++ Toolkit ANNOUNCE] NCBI C++ Toolkit Release (April 2, 2003)
</TITLE>
<LINK REL="Index" HREF="index.html" >
<LINK REL="made" HREF="mailto:cpp-announce%40ncbi.nlm.nih.gov?Subject=%5BC%2B%2B%20Toolkit%20ANNOUNCE%5D%20NCBI%20C%2B%2B%20Toolkit%20Release%20%28April%202%2C%202003%29&In-Reply-To=">
<META NAME="robots" CONTENT="index,nofollow">
<META http-equiv="Content-Type" content="text/html; charset=us-ascii">
<LINK REL="Previous" HREF="000079.html">
<LINK REL="Next" HREF="000081.html">
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>[C++ Toolkit ANNOUNCE] NCBI C++ Toolkit Release (April 2, 2003)</H1>
<!--htdig_noindex-->
<B>Denis Vakatov</B>
<A HREF="mailto:cpp-announce%40ncbi.nlm.nih.gov?Subject=%5BC%2B%2B%20Toolkit%20ANNOUNCE%5D%20NCBI%20C%2B%2B%20Toolkit%20Release%20%28April%202%2C%202003%29&In-Reply-To="
TITLE="[C++ Toolkit ANNOUNCE] NCBI C++ Toolkit Release (April 2, 2003)">vakatov at ncbi.nlm.nih.gov
</A><BR>
<I>Wed Apr 16 17:08:26 EDT 2003</I>
<P><UL>
<LI>Previous message: <A HREF="000079.html">[C++ Toolkit ANNOUNCE]
Public pre-Release of the C++ Toolkit sources (More on the topic)
</A></li>
<LI>Next message: <A HREF="000081.html">[C++ Toolkit ANNOUNCE] FYI:: GCC-3.2.3 released
</A></li>
<LI> <B>Messages sorted by:</B>
<a href="date.html#80">[ date ]</a>
<a href="thread.html#80">[ thread ]</a>
<a href="subject.html#80">[ subject ]</a>
<a href="author.html#80">[ author ]</a>
</LI>
</UL>
<HR>
<!--/htdig_noindex-->
<!--beginarticle-->
<PRE>The newest release of the NCBI C++ Toolkit is available at:
<A HREF="ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT/">ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT/</A>
It includes tarballs for UNIX, MS-Windows and Darwin.
A large (and yet incomplete) list of significant changes in the
Toolkit's API and functionality can be found in:
<A HREF="ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT/RELEASE_NOTES">ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT/RELEASE_NOTES</A>
The current version of RELEASE_NOTES goes here:
#############################################################################
NCBI C++ Toolkit Release (April 2, 2003)
#############################################################################
*** DOWNLOAD
<A HREF="ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/Apr_2_2003/">ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/Apr_2_2003/</A>
#############################################################################
*** CONTENT
Source code archives:
ncbi_cxx_unix.tar.gz -- for UNIX'es (see the list of UNIX flavors below)
ncbi_cxx_mac_cw.sit -- for MacOS 10.X / CodeWarrior 8.0 Update 8.3
ncbi_cxx_macosx.tar.gz -- for MacOS 10.X / GCC 3.1
ncbi_cxx_win.exe -- for MS-Windows / MSVC++ 6.0 (self-extracting)
ncbi_cxx_win.zip -- for MS-Windows / MSVC++ 6.0
Other:
RELEASE_NOTES -- this file
timestamp -- when the sources were checked out of the CVS
#############################################################################
*** NEW DEVELOPMENTS -- LIBRARIES
------------------------------------------------------------------
+++++ CoreLib +++++ (corelib)
1. Redesigned CRef&lt;&gt;:
A) constructor made explicit,
B) const CRef&lt;&gt; now returns const reference to the object.
2. Reimplemented CPipe, enabled it for &quot;windows&quot; subsystem under MS-Windows.
3. Multiple fixes and extensions in the NStr:: string manipulation functions.
------------------------------------------------------------------
+++++ Data streaming, Networking, and Dispatching +++++ (connect)
1. Implemented C++ API for sockets (based on the &quot;C&quot; SOCK API).
2. &quot;C&quot; SOCK API to support UDP (datagram) sockets.
3. Put in a draft implementation of C++ datagram socket API.
4. These two went to the 'util' library:
A) CStreamUtils::Readsome() -- a portable substitute for the
'istream::readsome' method, which also tries its best to
behave reasonably when reading from a non-blocking source.
B) CStreamUtils::Pushback() -- to put an arbitrary block of data
&quot;back&quot; to the standard C++ 'istream'.
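
The pushback idea in 4.B can be illustrated with a minimal Python sketch.
It is not Toolkit code: the class PushbackReader and its methods are
hypothetical names, and it only shows the general technique of serving
pushed-back bytes before reading further from the underlying stream.

```python
import io


class PushbackReader:
    """Wrap a binary stream so that already-read bytes can be returned to it."""

    def __init__(self, stream):
        self._stream = stream
        self._pending = b""  # bytes pushed back, served before the stream

    def pushback(self, data):
        # The most recently pushed block must be read first.
        self._pending = bytes(data) + self._pending

    def read(self, size):
        # Serve pushed-back bytes first, then fall through to the stream.
        head = self._pending[:size]
        self._pending = self._pending[size:]
        missing = size - len(head)
        if missing:
            head += self._stream.read(missing)
        return head


reader = PushbackReader(io.BytesIO(b"HEADER body"))
magic = reader.read(6)   # look at the first bytes
reader.pushback(magic)   # not ours to consume: put them back
```

A later reader.read() sees the stream as if the first six bytes had never
been consumed, which is what lets a format sniffer hand the stream over to
the real parser untouched.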
------------------------------------------------------------------
+++++ Database Connectivity (DBAPI) +++++ (dbapi)
1. Adapted FreeTDS version 8 (see LICENSE file to use it) to be built
as a part of the Toolkit (FreeTDS driver -- on UNIX only).
2. Added projects to build DBAPI drivers as DLLs on MS-Win/MSVC++:
CTLib, DBLib, MSDBLib, and ODBC.
------------------------------------------------------------------
+++++ CGI/FastCGI Framework +++++ (cgi)
1. Improved and extended logging and statistics.
2. Added &quot;Cookie affinity&quot; support.
3. Allow for a graceful break of FastCGI loop using a &quot;watch file&quot;.
4. CCgiApp -- added new methods GetFCgiIteration() and IsFastCGI().
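
One plausible reading of the &quot;watch file&quot; mechanism in item 3 is sketched
below in Python. This is an assumption about the general technique, not
the Toolkit's implementation: the names watch_stamp and serve are
hypothetical, and the sketch simply leaves the request loop, between
requests, once the watched file has changed since the loop started.

```python
import os


def watch_stamp(path):
    """Return the file's modification time, or None if it does not exist."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return None


def serve(requests, handle, watch_path):
    """Handle requests until they run out or the watch file changes."""
    baseline = watch_stamp(watch_path)
    served = 0
    for request in requests:
        if watch_stamp(watch_path) != baseline:
            break  # stop cleanly between requests, never in the middle of one
        handle(request)
        served += 1
    return served
```

Checking only between requests is what makes the break graceful: a request
that has already started is always allowed to finish.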
------------------------------------------------------------------
+++++ Data Serialization +++++ (serial)
1. XML streams now support serialization of objects which are generated by
a DTD specification (see also +Datatool+ below).
2. Added type-info structures and functions for CTime.
------------------------------------------------------------------
+++++ Object Manager +++++ (objects/objmgr)
1. Reimplemented CSeqMap class for working with sequence maps.
A) Performance improved significantly for large segmented sequences.
B) Sequence map iterator CSeqMap_CI implemented for efficient browsing
and iteration of sequence map segments.
2. Reimplemented CSeqVector class for working with sequence data.
A) Sequence map iterator CSeqMap_CI was used for efficient access to
data segments.
B) Performance improved significantly by efficient use of inlined methods.
3. Rewrote several inefficient methods of CHandleRange and
CHandleRangeMap classes.
4. Changes in CScope:
A) Optimized communication with CDataSource by adding several
data/index caches.
B) Added priorities to data sources.
5. Rewrote CFeat_CI, CAlign_CI and CGraph_CI:
A) Performance of the feature iterator increased by more than an order of magnitude:
a) CFeat_CI and CGraph_CI return temporary objects instead of mapped
CSeq_feat or CSeq_graph,
b) Feature gathering and location mapping is performed in one pass,
c) Added several cached indexes,
d) Sorting of features is performed at the end of feature gathering
inside resulting vector&lt;&gt;,
e) Highly optimized feature sorting function.
B) Implemented many additional tuning flags for the feature iterator, such as:
a) Several levels of sequence map selection for feature gathering,
b) Several levels of restriction of feature source location,
c) Different sorting methods.
6. Added top level Seq-entry iterator to CScope (CTSE_CI).
7. Added possibility to remove or replace annotations in seq-entries indexed
by CDataSource.
8. Added CBioseq_Handle::GetComplexityLevel() method returning Seq-entry of
required complexity.
9. Renamed test applications and adjusted makefiles to match the source
file names. Added a script and adjusted makefiles to run loader-related
tests through different database interfaces (ID1 and PubSeq).
------------------------------------------------------------------
+++++ Alignment Manager +++++ (objects/alnmgr)
1. Classes for fast and easy access to the segments or chunks of an
alignment (CAlnMap and related) and the base pairs of each sequence
(CAlnVec).
2. Classes (CAlnMix and related) for constructing a virtual multiple
alignment out of a set of input alignments.
NOTE:
All AlnMgr functionality is based upon the Seq-align and
Dense-seg ASN.1 specifications.
------------------------------------------------------------------
+++++ Alignment Algorithms +++++ (algo)
A new library called `algo' has been introduced to
facilitate computationally intensive tasks. The library resides in a
top-level Toolkit tree branch and is built with speed optimization on.
This first release of the library includes several alignment algorithms
widely used by biologists:
- CNWAligner:
The Needleman-Wunsch algorithm producing pairwise global alignments of
nucleotide or protein sequences. The implementation uses an affine gap
penalty model and optionally supports end-space-free alignments.
- CMMAligner:
Derived from CNWAligner, this class encapsulates Hirschberg's
divide-and-conquer algorithm (also credited to Myers and Miller) under
which the amount of space required to run the NW becomes a linear
function of the sequences' lengths. While the latter is achieved at a
cost of lower performance, a parallel version of the algorithm which
is also provided can run even faster than the classical NW in a
multiple-CPU environment.
- CNWAlignerMrna2Dna:
As is apparent from its name, this algorithm is specifically
designed for making spliced alignments. It calculates a global
alignment that specially accounts for splice signals in its dynamic
programming recurrences resulting in better alignments for these
particular types of sequences.
A sample program is provided to demonstrate the usage of the library,
though it is also complete enough to be used as a standalone application.
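
To make the affine penalty model mentioned for CNWAligner concrete, here
is a minimal Python sketch of global alignment scoring with affine gaps
(the three-state recurrence usually attributed to Gotoh). It is not the
Toolkit's code: the function name, the scoring defaults and the score-only
result are assumptions chosen for illustration, and end-space-free mode is
not covered.

```python
NEG = float("-inf")


def nw_affine_score(a, b, match=1, mismatch=-1, gap_open=3, gap_extend=1):
    """Score of the best global alignment of a and b under affine gap costs.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    n, m = len(a), len(b)
    # Best score of aligning a[:i] with b[:j] when the alignment ends in
    # a letter pair (M), a gap in b (X), or a gap in a (Y).
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = pair + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a gap is charged gap_open, continuing one gap_extend.
            X[i][j] = max(M[i - 1][j] - gap_open,
                          Y[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          X[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])
```

Under these defaults one gap of length two costs 4, while two separate
gaps of length one cost 6, so the model prefers a single longer gap. The
sketch keeps full tables, so its memory use is quadratic; the
linear-space behaviour described for CMMAligner needs the
divide-and-conquer scheme and is not shown here.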
------------------------------------------------------------------
+++++ Object Validator +++++ (objects/validator)
A complete implementation of a C++ Validator is now available (excluding
validation of alignments). The test_validator application can validate
Seq-entry, Seq-submit and standalone annotation records. It can run in a
batch mode to handle NCBI release files.
------------------------------------------------------------------
+++++ Flat-File Generator +++++ (objects/flat)
There is a new version of the flat-file generator, with support for some
additional output formats (EMBL, GBSeq, and tabular as well as GenBank).
NOTE: neither version produces canonical output; use the C Toolkit for that.
------------------------------------------------------------------
+++++ Miscellaneous Libs (additions and improvements) +++++
1. `seqset' (objects/seqset)
-- [CSeq_entry] Code to read sequences from FASTA files.
2. `seqloc' (objects/seqloc)
-- [CSeq_loc] Caching of heavily used CSeq_loc::GetTotalRange() method.
3. `seqfeat' (objects/??????)
-- [CSeq_feat] Optimized feature sorting methods.
4. `util' (util)
-- [logrotate] Class for automatically-rotated log files.
-- [strsearch] Fast string search utilities using Boyer-Moore algorithm
and a finite-state automaton search.
-- [range] Optimized CRange&lt;&gt; and CRangeMap&lt;&gt; classes.
5. `xobjutil' (objects/util)
-- [sequence] a) Converting coordinates between corresponding source and
product locations on a feature, and generally determining
one location's position relative to another.
b) Class CCdregion_translate for rapid translation
from a given genetic code, allowing all of the iupac
nucleotide ambiguity characters.
c) Added TestForOverlap() and GetBestOverlappingFeat().
-- [genbank] Optimized GenBank formatter.
6. `regexp' (util/regexp)
-- PCRE library (see LICENSE) has been embedded into the Toolkit.
7. `html' (html)
-- Added code to support Sergey Kurdin's popup menu.
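
As background for the [strsearch] entry in item 4, the Python sketch below
shows the bad-character shift idea behind Boyer-Moore searching. It uses
the simpler Horspool variant, not full Boyer-Moore, and it is not the
Toolkit's implementation; the function name horspool_find is hypothetical.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m = len(pattern)
    if m == 0:
        return 0
    # Bad-character table: how far the window may shift when its last
    # character is c. Later occurrences overwrite earlier ones, giving the
    # smallest safe shift; characters absent from the table shift by m.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    pos = 0
    while len(text) - pos >= m:
        # A slice comparison keeps the sketch short; real implementations
        # compare right to left and stop at the first mismatch.
        if text[pos:pos + m] == pattern:
            return pos
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The gain over a naive scan comes from the shift table: when the last
character of the current window does not occur in the pattern at all, the
window jumps by the full pattern length.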
#############################################################################
*** NEW DEVELOPMENTS -- APPLICATIONS and TOOLS
------------------------------------------------------------------
+++++ NCBI Genome Workbench (GBench) +++++ (gui/gbench, gui/core, etc)
There is a fledgling GBench project which is maturing rapidly; however,
it has its own release schedule so far (although it is a part of
the Toolkit). To get more info about GBench:
<A HREF="http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/tools/gbench/gbench.html">http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/tools/gbench/gbench.html</A>
or contact the GBench project leader, Mike DiCuccio.
&quot;GBench: NCBI Genome Workbench is an application for visualization of
molecule information (sequences, alignment, molecule features, etc).&quot;
------------------------------------------------------------------
+++++ Datatool +++++ (serial/datatool)
1. Can now generate serialization code based on a DTD specification.
While it does not yet support all DTD features, the most
widely used features are implemented, and new ones can be added upon request.
2. Can now generate simple clients for RPC-style ASN.1- or XML-based
network services, and has been configured to do so for
NCBI's ID1, Entrez2, and Medline Archive services.
#############################################################################
*** BUILD FRAMEWORK
------------------------------------------------------------------
+++++ Parallel build w/GMAKE +++++
gmake -jN should now work properly (though the build system still
doesn't support cross-project parallelization with any version of
make).
------------------------------------------------------------------
+++++ MacOS 10.X / CodeWarrior 8.0 Update 8.3 +++++
Projects include targets to compile with BSD/Apple headers and libraries or
with MSL headers and libraries.
------------------------------------------------------------------
+++++ MS-Win/MSVC project tree +++++
1. Added project &quot;all_objects_generation&quot; to auto-generate
serialization source code &quot;on-the-fly&quot; during the build.
NOTE: so, we do not have to pack pre-generated serialization code
to the MS-Win/MSVC distribution archive anymore.
2. Amended the code to be built as DLLs.
3. Created new workspace (and underlying projects) to build &quot;cluster DLLs&quot;
(i.e. DLLs which implement APIs corresponding to several static libs).
#############################################################################
*** DOCUMENTATION
It has been mostly frozen lately due to its conversion from HTML to
XML format. The reformatted version of the docs will replace the old
one soon.
There is also an interesting development underway -- to use the
DOXYGEN source browser to complement the traditional docs with
structured DOXYGEN-generated &quot;Library Reference&quot; ones based
on the in-source comments.
#############################################################################
*** BACKWARD COMPATIBILITY
There have been quite a lot of API changes since the October release,
mostly in the code generation for serializable objects and in the
ObjectManager-related APIs.
------------------------------------------------------------------
+++++ Exceptions +++++
There is an effort underway to back-fit the Toolkit with
hierarchically organized exception classes, so now some parts of
the code throw these new exceptions rather than e.g. &quot;runtime_error&quot;.
Still, the new exception class hierarchy is derived from &quot;std::exception&quot;.
For more details, see APPENDIX &quot;BACKWARD_COMPATIBILITY.Exception&quot; attached.
------------------------------------------------------------------
+++++ &quot;External&quot; packages used +++++
The code has been adapted to work with the newer version of
the NCBI C Toolkit and some 3rd party packages:
FLTK -- 1.1.3
wxWin -- 2.4.0
FCGI -- 2.2.0
------------------------------------------------------------------
+++++ Miscellaneous +++++
1. Changed names of the static-lib versions of the DBAPI drivers on MS Windows
(added suffix &quot;_static&quot;, like: dbapi_driver_odbc_static.lib)
------------------------------------------------------------------
+++++
So, be prepared to encounter and fight code backward-compatibility issues.
#############################################################################
*** PLATFORMS (OS's, compilers used inside NCBI)
------------------------------------------------------------------
+++++ UNIX +++++
Linux, INTEL, GCC 3.0.4
Linux, INTEL, ICC 7.0
Solaris, SPARC, WorkShop 6 update 2 C++ 5.3 Patch 111685-13
Solaris, SPARC, GCC 2.95.2
Solaris, SPARC, GCC 3.0.4
Solaris, INTEL, WorkShop 6 update 2 C++ 5.3 Patch 111686-13
Solaris, INTEL, GCC 2.95.2
IRIX64, SGI-Mips, MIPSpro 7.3.1.2m
FreeBSD, INTEL, GCC 3.0.4
Tru64, ALPHA, GCC 2.95.3
Tru64, ALPHA, Compaq C V6.3-029 / Compaq C++ V6.5-014
------------------------------------------------------------------
+++++ MS Windows +++++
MSVC++ 6.0 Service Pack 5.....
------------------------------------------------------------------
+++++ MacOSX +++++
MacOS 10.1, GCC 3.1 (patched, see &quot;doc/config_darwin.html&quot;)
MacOS 10.2, GCC 3.1 (patched, see &quot;doc/config_darwin.html&quot;)
MacOS 10.X, CodeWarrior 8.0 Update 8.3
#############################################################################
*** CAVEATS
------------------------------------------------------------------
+++++ MacOS 10.2 / GCC 3.1 +++++
Toolkit builds okay in all modes, but it runs okay only in Debug, non-DLL mode.
Also, GBench requires DYLD_BIND_AT_LAUNCH to be set. Otherwise it will not
run (will hang at startup waiting for semaphore).
These are GCC-related issues and Apple is aware of them. The next OS release
(expected in September, with a preview available in June) is supposed to
address most of them and introduce many others, since they will be switching
to 64 bit hardware.
------------------------------------------------------------------
+++++ MacOS 10.X / CodeWarrior 8.0 Update 8.3 +++++
1. Not all of the applications are built.
2. GUI-related projects are not built.
------------------------------------------------------------------
+++++ GCC 3.2 +++++
This version has a bug (fixed in 3.2.1) in the C++ iostream implementation,
so some of the code that relies on C++ iostreams (such as serialization
and iostream wrappers for network and other types of connections, as
well as ObjectManager GenBank loader) would not work. Advice: upgrade.
#############################################################################
*** ETC
1. The GCC/X11 configuration on MacOS X behaves (builds and runs)
just like a regular UNIX:
ncbi_cxx_macosx.tar.gz -&gt; ncbi_cxx_unix.tar.gz
=============================================================================
=============================================================================
APPENDIX
#############################################################################
BACKWARD_COMPATIBILITY: Exceptions
1.
We are continuing with the effort to make the C++ Toolkit exceptions
more structured -- creating a library/module based hierarchy of
exceptions, such as:
CException
   CCoreException
      CFileException
      CMutexException
      CStringException
      ..........
   CConnectException
      ..........
   ..........
   CXxxException
      ..........
   ..........
1.A) One reason to do this is to allow an upper-level user to easily
identify (in their code) exactly which library or module has
thrown the exception, and what the particular reason for throwing it was.
This is really necessary sometimes.
1.B) Another reason to do this is to give all exceptions used inside
(and thrown from) the C++ Toolkit more common
functionality (as described and implemented in the base &quot;CException&quot;
class).
2.
As a consequence of this development, &quot;C{Errno,Parse}Exception&quot; will be
eliminated altogether. We realize that it may create some backward
compatibility problems for some of your code which relies on these,
but:
2.A) All problems related to the elimination of &quot;C{Errno,Parse}Exception&quot;
will show up early (during the code compilation), and they can be fixed
rather quickly -- you just start catching the new, lib/mod-specific
exceptions (which BTW, still provide the &quot;C{Errno,Parse}Exception&quot; API,
see [2.C] below).
2.B) Having such generic, library/module-independent
&quot;C{Errno,Parse}Exception&quot; around does not serve anyone well,
as they provide no information on their origin and specific error
(except in the err.message, but it's neither reliable nor easy or
obvious to extract the needed info from the messages as they are
intended to be used for logging, and not to be parsed by some
wild-guessing ad hoc message-parsing code).
2.C) There are templates CErrnoTemplException&lt;&gt; and CParseTemplException&lt;&gt;
to allow the creation of exceptions which would have these specific
(&quot;errno&quot; and &quot;parse&quot; oriented) APIs, and yet belong to the right
place in the C++ Toolkit hierarchy.
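
The design described in 2.A through 2.C can be illustrated by analogy in
Python. This is only a sketch of the idea, not the Toolkit's C++ API: all
class names are hypothetical, and a mixin class stands in for the role the
C++ templates play, namely adding an errno-oriented API to an exception
that still sits in its own module's branch of the hierarchy.

```python
import errno


class ToolkitError(Exception):
    """Common base shared by every module-specific error."""


class CoreError(ToolkitError):
    """Errors raised by the (hypothetical) core module."""


class ConnectError(ToolkitError):
    """Errors raised by the (hypothetical) connect module."""


class ErrnoMixin:
    """Adds an errno-oriented API to whichever module error it is mixed into."""

    def __init__(self, message, code):
        super().__init__(message)
        self.code = code

    def errno_name(self):
        return errno.errorcode.get(self.code, "UNKNOWN")


class CoreFileError(ErrnoMixin, CoreError):
    """Belongs to the core branch, yet still exposes the errno API."""


try:
    raise CoreFileError("cannot open input", errno.ENOENT)
except ConnectError:
    origin = "connect"
except CoreError as err:
    origin = "core"            # the handler knows which module failed
    reason = err.errno_name()  # and can still ask the errno-style question
```

The caller identifies the origin from the exception's type and the
specific error from its API, without parsing the message text.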
#############################################################################
$Date: 2003/04/16 20:57:20 $
#############################################################################
</PRE>
<!--endarticle-->
<!--htdig_noindex-->
<HR>
<hr>
<a href="http://www.ncbi.nlm.nih.gov/mailman/listinfo/cpp-announce">More information about the cpp-announce
mailing list</a><br>
<!--/htdig_noindex-->
</body></html>